scvi.data.add_dna_sequence
- scvi.data.add_dna_sequence(adata, seq_len=1334, genome_name='hg38', genome_dir=None, genome_provider=None, install_genome=True, chr_var_key='chr', start_var_key='start', end_var_key='end', sequence_varm_key='dna_sequence', code_varm_key='dna_code')
Add DNA sequence to AnnData object.
Uses genomepy under the hood to download the genome.
- Parameters:
adata (AnnData) – AnnData object with chromatin accessibility data
seq_len (int) – Length of DNA sequence to extract around peak center. Defaults to the value used in scBasset.
genome_name (str) – Name of genome to use, installed with genomepy
genome_dir (Optional[Path]) – Directory to install genome to, if not already installed
genome_provider (Optional[str]) – Provider of genome, passed to genomepy
install_genome (bool) – Install the genome with genomepy. If False, genome_provider is not used, and a genome is loaded with genomepy.Genome(genome_name, genomes_dir=genome_dir)
chr_var_key (str) – Key in .var for chromosome
start_var_key (str) – Key in .var for start position
end_var_key (str) – Key in .var for end position
sequence_varm_key (str) – Key in .varm for added DNA sequence
code_varm_key (str) – Key in .varm for added DNA sequence, encoded as integers
- Returns:
None
- Adds fields to .varm:
sequence_varm_key: DNA sequence
code_varm_key: DNA sequence, encoded as integers
- Return type:
None
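The two ideas described above, a fixed-length window around the peak center and an integer encoding of the extracted sequence, can be illustrated with a small standalone sketch. This is not the scvi-tools implementation: the helper names, the rounding of the peak center, the bounds handling, and the nucleotide-to-integer mapping (including the code used for unknown bases) are all assumptions made for illustration, and the library's actual behavior may differ.

```python
from __future__ import annotations

# Illustrative only: NOT the scvi-tools implementation.
# Assumed mapping for this sketch; the library's actual codes may differ.
NUCLEOTIDE_CODES = {"A": 0, "C": 1, "G": 2, "T": 3}
UNKNOWN_CODE = 4  # hypothetical code for N or any other symbol


def centered_window(start: int, end: int, seq_len: int) -> tuple[int, int]:
    """Return a half-open window of seq_len bases centered on the peak midpoint."""
    center = (start + end) // 2  # assumed: integer midpoint of the peak
    win_start = center - seq_len // 2
    return win_start, win_start + seq_len


def encode_sequence(seq: str) -> list[int]:
    """Map each base to an integer code using the assumed mapping above."""
    return [NUCLEOTIDE_CODES.get(base, UNKNOWN_CODE) for base in seq.upper()]


def extract_and_encode(
    chrom_seq: str, start: int, end: int, seq_len: int
) -> tuple[str, list[int]]:
    """Slice the window out of a chromosome string and encode it."""
    win_start, win_end = centered_window(start, end, seq_len)
    if win_start < 0 or win_end > len(chrom_seq):
        # Assumed behavior for this sketch: refuse windows that leave the chromosome.
        raise ValueError("window extends past chromosome bounds")
    seq = chrom_seq[win_start:win_end]
    return seq, encode_sequence(seq)


# Toy chromosome and peak, purely hypothetical data.
toy_chrom = "ACGTNACGTACG"
sequence, codes = extract_and_encode(toy_chrom, start=4, end=8, seq_len=4)
print(sequence, codes)
```

In the real function, the sequence comes from the genome loaded through genomepy, and the results are stored per peak in .varm under sequence_varm_key and code_varm_key rather than returned.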