scvi.module.VAE

class scvi.module.VAE(n_input, n_batch=0, n_labels=0, n_hidden=128, n_latent=10, n_layers=1, n_continuous_cov=0, n_cats_per_cov=None, dropout_rate=0.1, dispersion='gene', log_variational=True, gene_likelihood='zinb', latent_distribution='normal', encode_covariates=False, deeply_inject_covariates=True, use_batch_norm='both', use_layer_norm='none', use_size_factor_key=False, use_observed_lib_size=True, library_log_means=None, library_log_vars=None, var_activation=None)[source]

Bases: BaseMinifiedModeModuleClass

Variational auto-encoder model.

This is an implementation of the scVI model described in [Lopez et al., 2018].

Parameters:
  • n_input (int) – Number of input genes

  • n_batch (int) – Number of batches. If 0, no batch correction is performed.

  • n_labels (int) – Number of labels

  • n_hidden (Tunable[int]) – Number of nodes per hidden layer

  • n_latent (Tunable[int]) – Dimensionality of the latent space

  • n_layers (Tunable[int]) – Number of hidden layers used for encoder and decoder NNs

  • n_continuous_cov (int) – Number of continuous covariates

  • n_cats_per_cov (Optional[Iterable[int]]) – Number of categories for each extra categorical covariate

  • dropout_rate (Tunable[float]) – Dropout rate for neural networks

  • dispersion (Tunable[Literal['gene', 'gene-batch', 'gene-label', 'gene-cell']]) –

    One of the following:

    • 'gene' - dispersion parameter of NB is constant per gene across cells

    • 'gene-batch' - dispersion can differ between different batches

    • 'gene-label' - dispersion can differ between different labels

    • 'gene-cell' - dispersion can differ for every gene in every cell

  • log_variational (bool) – If True, apply log(1 + x) to the input data before encoding, for numerical stability. This is not normalization.

  • gene_likelihood (Tunable[Literal['zinb', 'nb', 'poisson']]) –

    One of:

    • 'nb' - Negative binomial distribution

    • 'zinb' - Zero-inflated negative binomial distribution

    • 'poisson' - Poisson distribution

  • latent_distribution (Tunable[Literal['normal', 'ln']]) –

    One of:

    • 'normal' - Isotropic normal

    • 'ln' - Logistic normal with normal params N(0, 1)

  • encode_covariates (Tunable[bool]) – Whether to concatenate covariates to the expression input of the encoder

  • deeply_inject_covariates (Tunable[bool]) – Whether to concatenate covariates into the output of hidden layers in the encoder/decoder. This option only applies when n_layers > 1; the covariates are concatenated to the input of each subsequent hidden layer.

  • use_batch_norm (Tunable[Literal['encoder', 'decoder', 'none', 'both']]) – Where to apply batch normalization: in the encoder, the decoder, both, or neither.

  • use_layer_norm (Tunable[Literal['encoder', 'decoder', 'none', 'both']]) – Where to apply layer normalization: in the encoder, the decoder, both, or neither.

  • use_size_factor_key (bool) – If True, use the size_factor AnnDataField defined by the user as the scaling factor in the mean of the conditional distribution. Takes priority over use_observed_lib_size.

  • use_observed_lib_size (bool) – If True, use the observed library size for RNA as the scaling factor in the mean of the conditional distribution.

  • library_log_means (Optional[ndarray]) – 1 x n_batch array of means of the log library sizes. Parameterizes the prior on library size if not using observed library size.

  • library_log_vars (Optional[ndarray]) – 1 x n_batch array of variances of the log library sizes. Parameterizes the prior on library size if not using observed library size.

  • var_activation (Optional[Callable]) – Callable used to ensure positivity of the variational distributions’ variance. When None, defaults to torch.exp.
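A minimal construction sketch (most workflows instead use the higher-level scvi.model.SCVI, which builds this module internally; all dimensions below are hypothetical):

>>> from scvi.module import VAE
>>> module = VAE(
...     n_input=2000,          # hypothetical: 2,000 genes
...     n_batch=3,             # hypothetical: 3 batches, enabling batch correction
...     n_latent=10,           # 10-dimensional latent space
...     gene_likelihood="nb",  # negative binomial observation model
... )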

Attributes table

training

Methods table

generative(z, library, batch_index[, ...])

Runs the generative model.

loss(tensors, inference_outputs, ...[, ...])

Computes the loss function for the model.

marginal_ll(tensors, n_mc_samples)

Computes the marginal log likelihood of the model.

sample(tensors[, n_samples, library_size])

Generate observation samples from the posterior predictive distribution.

Attributes

training

VAE.training: bool

Methods

generative

VAE.generative(z, library, batch_index, cont_covs=None, cat_covs=None, size_factor=None, y=None, transform_batch=None)[source]

Runs the generative model.
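A hedged sketch: generative is typically fed the outputs of the encoder pass (the inference method inherited from the base class). The output-dict keys "z" and "library" follow this class's conventions, but key names can vary across scvi-tools versions. Continuing the construction sketch above:

>>> import torch
>>> x = torch.randint(0, 20, (64, 2000)).float()  # hypothetical raw counts for 64 cells
>>> batch_index = torch.zeros(64, 1)
>>> inf = module.inference(x, batch_index)        # encoder pass: latent z and library size
>>> out = module.generative(inf["z"], inf["library"], batch_index)
>>> sorted(out.keys())  # generative outputs, e.g. the conditional expression distribution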

loss

VAE.loss(tensors, inference_outputs, generative_outputs, kl_weight=1.0)[source]

Computes the loss function for the model.

Parameters:

kl_weight (float) – Weight applied to the KL divergence term of the loss (e.g., for KL annealing during training).
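A sketch of computing the loss through the module's forward call (inherited from the base class), assuming a tensors dict keyed by scvi's registry keys and reusing x and batch_index from the sketch above; the LossOutput attribute names may differ across scvi-tools versions:

>>> from scvi import REGISTRY_KEYS
>>> tensors = {
...     REGISTRY_KEYS.X_KEY: x,                    # counts from the sketch above
...     REGISTRY_KEYS.BATCH_KEY: batch_index,
...     REGISTRY_KEYS.LABELS_KEY: torch.zeros(64, 1),
... }
>>> _, _, loss_output = module(tensors, loss_kwargs={"kl_weight": 0.5})
>>> loss_output.loss  # scalar training objective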

marginal_ll

VAE.marginal_ll(tensors, n_mc_samples)[source]

Computes the marginal log likelihood of the model.
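For example, reusing the tensors dict from the loss sketch above; n_mc_samples sets the number of Monte Carlo samples used in the estimate:

>>> ll = module.marginal_ll(tensors, n_mc_samples=100)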

sample

VAE.sample(tensors, n_samples=1, library_size=1)[source]

Generate observation samples from the posterior predictive distribution.

The posterior predictive distribution is written as \(p(\hat{x} \mid x)\).

Parameters:
  • tensors – Dictionary of input tensors

  • n_samples – Number of samples to generate for each cell

  • library_size – Library size to scale the samples to

Returns:

x_new – tensor with shape (n_cells, n_genes, n_samples)

Return type:

torch.Tensor
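For example, continuing the sketches above:

>>> x_new = module.sample(tensors, n_samples=5)
>>> x_new.shape  # (64, 2000, 5): cells x genes x samples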