class scvi.module.TOTALVAE(n_input_genes, n_input_proteins, n_batch=0, n_labels=0, n_hidden=256, n_latent=20, n_layers_encoder=2, n_layers_decoder=1, n_continuous_cov=0, n_cats_per_cov=None, dropout_rate_decoder=0.2, dropout_rate_encoder=0.2, gene_dispersion='gene', protein_dispersion='protein', log_variational=True, gene_likelihood='nb', latent_distribution='normal', protein_batch_mask=None, encode_covariates=True, protein_background_prior_mean=None, protein_background_prior_scale=None, use_observed_lib_size=True, use_batch_norm='both', use_layer_norm='none')

Bases: scvi.module.base._base_module.BaseModuleClass

Total variational inference for CITE-seq data.

Implements the totalVI model of [GayosoSteier21].

n_input_genes : int

Number of input genes

n_input_proteins : int

Number of input proteins

n_batch : int (default: 0)

Number of batches

n_labels : int (default: 0)

Number of labels

n_hidden : int (default: 256)

Number of nodes per hidden layer for encoder and decoder

n_latent : int (default: 20)

Dimensionality of the latent space


n_layers_encoder, n_layers_decoder : int (defaults: 2, 1)

Number of hidden layers used for the encoder and decoder neural networks, respectively

n_continuous_cov : int (default: 0)

Number of continuous covariates

n_cats_per_cov : Iterable[int] | None (default: None)

Number of categories for each extra categorical covariate


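dropout_rate_encoder, dropout_rate_decoder : float (default: 0.2)

Dropout rate for the encoder and decoder neural networks, respectively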

gene_dispersion : str (default: 'gene')

One of the following

  • 'gene' - dispersion parameter of the NB is constant per gene across cells

  • 'gene-batch' - dispersion can differ between batches

  • 'gene-label' - dispersion can differ between labels
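One way to picture the three options is by the shape of the dispersion parameter each one implies. The sketch below is illustrative only: the helper `dispersion_shape` is hypothetical, and the (genes, batches) and (genes, labels) layouts are an assumption rather than something this page states.

```python
def dispersion_shape(gene_dispersion, n_genes, n_batch=0, n_labels=0):
    """Shape of the NB dispersion parameter implied by each option (assumed layout)."""
    if gene_dispersion == "gene":
        # One value per gene, shared by every cell.
        return (n_genes,)
    if gene_dispersion == "gene-batch":
        # One value per gene and batch.
        return (n_genes, n_batch)
    if gene_dispersion == "gene-label":
        # One value per gene and label.
        return (n_genes, n_labels)
    raise ValueError(f"unknown gene_dispersion: {gene_dispersion!r}")


shape_gene = dispersion_shape("gene", n_genes=4000)
shape_batch = dispersion_shape("gene-batch", n_genes=4000, n_batch=3)
```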

protein_dispersion : str (default: 'protein')

One of the following

  • 'protein' - protein_dispersion parameter is constant per protein across cells

  • 'protein-batch' - protein_dispersion can differ between different batches (not tested)

  • 'protein-label' - protein_dispersion can differ between different labels (not tested)

log_variational : bool (default: True)

Apply log(1 + data) before encoding, for numerical stability. This is not a normalization step.
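A minimal sketch of the transform this flag describes, assuming it amounts to an element-wise log(1 + x) applied to raw counts before they reach the encoder. The function name `log1p_counts` is illustrative and not part of scvi-tools.

```python
import math


def log1p_counts(counts):
    """Element-wise log(1 + x): compresses large counts and keeps zeros at zero."""
    return [math.log1p(c) for c in counts]


raw = [0, 1, 10, 1000]
encoder_input = log1p_counts(raw)
```

The transform is monotone, so the ordering of counts is preserved while their dynamic range shrinks.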

gene_likelihood : str (default: 'nb')

One of

  • 'nb' - Negative binomial distribution

  • 'zinb' - Zero-inflated negative binomial distribution
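To make the difference between the two likelihoods concrete, here is a sketch of both log-probabilities in the common mean / inverse-dispersion parameterization. The names `mu`, `theta` and `pi` are assumptions for illustration; the module's internal parameterization may differ (for example, it may use logits rather than a probability for the zero-inflation term).

```python
import math


def nb_log_prob(x, mu, theta):
    """Log pmf of a negative binomial with mean mu and inverse dispersion theta."""
    log_theta_mu = math.log(theta + mu)
    return (
        math.lgamma(x + theta) - math.lgamma(theta) - math.lgamma(x + 1)
        + theta * (math.log(theta) - log_theta_mu)
        + x * (math.log(mu) - log_theta_mu)
    )


def zinb_log_prob(x, mu, theta, pi):
    """Log pmf of a zero-inflated NB; pi in [0, 1) is the extra probability of a zero."""
    nb_prob = math.exp(nb_log_prob(x, mu, theta))
    zero_mass = pi if x == 0 else 0.0
    return math.log(zero_mass + (1.0 - pi) * nb_prob)
```

With `pi = 0` the zero-inflated form reduces to the plain negative binomial; any `pi > 0` moves probability mass onto zero counts.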

latent_distribution : str (default: 'normal')

One of

  • 'normal' - Isotropic normal

  • 'ln' - Logistic normal with normal params N(0, 1)
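A logistic normal sample can be sketched as a softmax applied to a Gaussian draw, which places the latent vector on the probability simplex. The snippet below uses N(0, 1) draws as in the description above; the function name and the use of Python's `random` module are illustrative assumptions, not the module's implementation.

```python
import math
import random


def sample_logistic_normal(n_latent, rng):
    """Draw z ~ N(0, I) and map it to the simplex with a numerically stable softmax."""
    z = [rng.gauss(0.0, 1.0) for _ in range(n_latent)]
    z_max = max(z)  # subtract the max before exponentiating to avoid overflow
    exp_z = [math.exp(v - z_max) for v in z]
    total = sum(exp_z)
    return [e / total for e in exp_z]


rng = random.Random(0)
simplex_point = sample_logistic_normal(20, rng)
```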

protein_batch_mask : {str | int: ndarray} | None (default: None)

Dictionary mapping each batch code to an array that indicates, for each protein, whether it was observed in that batch.
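As an illustration of the expected structure, the sketch below builds such a dictionary for a hypothetical three-protein panel in which batch 1 did not measure one protein. The helper name, protein names and batch codes are made up for the example, and plain Python lists stand in for the boolean ndarrays the parameter expects (they could be converted with `numpy.asarray(values, dtype=bool)`).

```python
def build_protein_batch_mask(protein_names, measured_by_batch):
    """Map each batch code to a per-protein list of observed (True) / missing (False)."""
    return {
        batch: [name in measured for name in protein_names]
        for batch, measured in measured_by_batch.items()
    }


proteins = ["CD3", "CD4", "CD19"]                       # hypothetical panel
measured = {0: {"CD3", "CD4", "CD19"}, 1: {"CD3", "CD19"}}  # batch 1 lacks CD4
mask = build_protein_batch_mask(proteins, measured)
```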

encode_covariates : bool (default: True)

Whether to concatenate covariates to expression in encoder

protein_background_prior_mean : ndarray | None (default: None)

Array of shape (proteins, batches) used to initialize the prior on the protein background mean (log scale)

protein_background_prior_scale : ndarray | None (default: None)

Array of shape (proteins, batches) used to initialize the prior on the protein background scale (log scale)

use_observed_lib_size : bool (default: True)

Use observed library size for RNA as scaling factor in mean of conditional distribution
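A rough sketch of what using the observed library size as a scaling factor can mean: the library size of a cell is taken as its total RNA count, and the mean of the count distribution is that total multiplied by per-gene proportions. Both helper functions and the example proportions are hypothetical; in the model the proportions would come from the decoder.

```python
def observed_library_size(gene_counts):
    """Total RNA count of one cell, used directly instead of a latent library size."""
    return float(sum(gene_counts))


def scaled_mean(proportions, library_size):
    """Per-gene mean: library size times normalized expression proportions."""
    return [library_size * p for p in proportions]


cell = [3, 0, 7, 10]
library = observed_library_size(cell)
rates = scaled_mean([0.1, 0.2, 0.3, 0.4], library)
```

Because the proportions sum to one, the per-gene means sum back to the observed library size.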


generative(z, library_gene, batch_index, label)

Run the generative model.

get_reconstruction_loss(x, y, px_dict, py_dict)

Compute reconstruction loss.

get_sample_dispersion(x, y[, batch_index, …])

Returns the tensors of dispersions for genes and proteins.

inference(x, y[, batch_index, label, …])

Internal helper function to compute necessary inference quantities.

loss(tensors, inference_outputs, …[, …])

Returns the reconstruction loss and the Kullback-Leibler divergences.

marginal_ll(tensors, n_mc_samples)

sample(tensors[, n_samples])

Generate samples from the learned model.
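For orientation on the divergence terms that `loss` returns, the sketch below computes the closed-form Kullback-Leibler divergence between a diagonal Gaussian posterior and a standard normal prior. It only illustrates the kind of quantity involved for a 'normal' latent distribution; the function name is hypothetical and the exact set of divergence terms in the module is not described on this page.

```python
import math


def kl_diag_normal_vs_standard(mean, var):
    """KL( N(mean, diag(var)) || N(0, I) ), summed over latent dimensions."""
    return 0.5 * sum(v + m * m - 1.0 - math.log(v) for m, v in zip(mean, var))


kl_same = kl_diag_normal_vs_standard([0.0, 0.0], [1.0, 1.0])
kl_shifted = kl_diag_normal_vs_standard([1.0], [1.0])
```

The divergence is zero when the posterior equals the prior and grows as the posterior mean or variance moves away from it.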