scvi.data.add_dna_sequence
- scvi.data.add_dna_sequence(adata, seq_len=1344, genome_name='hg38', genome_dir=None, genome_provider=None, install_genome=True, chr_var_key='chr', start_var_key='start', end_var_key='end', sequence_varm_key='dna_sequence', code_varm_key='dna_code')
Add DNA sequence to AnnData object.
Uses genomepy under the hood to download the genome.
- Parameters:
  - adata (AnnData) – AnnData object with chromatin accessibility data.
  - seq_len (int (default: 1344)) – Length of DNA sequence to extract around peak center. Defaults to value used in scBasset.
  - genome_name (str (default: 'hg38')) – Name of genome to use, installed with genomepy.
  - genome_dir (Path | None (default: None)) – Directory to install genome to, if not already installed.
  - genome_provider (str | None (default: None)) – Provider of genome, passed to genomepy.
  - install_genome (bool (default: True)) – Install the genome with genomepy. If False, genome_provider is not used, and a genome is loaded with genomepy.Genome(genome_name, genomes_dir=genome_dir).
  - chr_var_key (str (default: 'chr')) – Key in .var for chromosome.
  - start_var_key (str (default: 'start')) – Key in .var for start position.
  - end_var_key (str (default: 'end')) – Key in .var for end position.
  - sequence_varm_key (str (default: 'dna_sequence')) – Key in .varm for added DNA sequence.
  - code_varm_key (str (default: 'dna_code')) – Key in .varm for added DNA sequence, encoded as integers.
- Return type:
  None
- Returns:
  None. Adds fields to .varm:
  - sequence_varm_key: DNA sequence
  - code_varm_key: DNA sequence, encoded as integers
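
The following is a minimal sketch of what the parameters above describe, not the library's actual implementation. The helpers `peak_window` and `encode_dna` and the wrapper `annotate_peaks` are hypothetical names introduced for illustration. The centering arithmetic and the integer codes (A=0, C=1, G=2, T=3, anything else=4) are assumptions; the library may center windows and handle unknown bases differently. Only the call to `scvi.data.add_dna_sequence` uses the documented signature.

```python
def peak_window(start: int, end: int, seq_len: int = 1344) -> tuple[int, int]:
    """Return a (window_start, window_end) interval of length seq_len
    centered on the midpoint of a peak (assumed centering rule)."""
    mid = (start + end) // 2
    window_start = mid - seq_len // 2
    # Note: no clipping at chromosome boundaries is done in this sketch.
    return window_start, window_start + seq_len


# Hypothetical integer encoding, chosen for illustration only.
NUCLEOTIDE_CODES = {"A": 0, "C": 1, "G": 2, "T": 3}
UNKNOWN_CODE = 4


def encode_dna(sequence: str) -> list[int]:
    """Map each base to an integer code; unrecognized bases get UNKNOWN_CODE."""
    return [NUCLEOTIDE_CODES.get(base, UNKNOWN_CODE) for base in sequence.upper()]


def annotate_peaks(adata, genome_dir=None):
    """Call the documented function on an AnnData whose .var holds
    'chr', 'start', and 'end' columns, then return the two added fields."""
    import scvi  # imported lazily so the helpers above run without scvi-tools

    scvi.data.add_dna_sequence(
        adata,
        seq_len=1344,
        genome_name="hg38",
        genome_dir=genome_dir,
        chr_var_key="chr",
        start_var_key="start",
        end_var_key="end",
    )
    # Default keys from the signature: 'dna_sequence' and 'dna_code'.
    return adata.varm["dna_sequence"], adata.varm["dna_code"]
```

Because the function returns None and modifies `adata` in place, results are read back from `adata.varm` under `sequence_varm_key` and `code_varm_key`. With `install_genome=True` the first call may download the genome through genomepy, so passing a persistent `genome_dir` avoids repeating that step.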