scvi.module.VAE#

class scvi.module.VAE(n_input, n_batch=0, n_labels=0, n_hidden=128, n_latent=10, n_layers=1, n_continuous_cov=0, n_cats_per_cov=None, dropout_rate=0.1, dispersion='gene', log_variational=True, gene_likelihood='zinb', latent_distribution='normal', encode_covariates=False, deeply_inject_covariates=True, batch_representation='one-hot', use_batch_norm='both', use_layer_norm='none', use_size_factor_key=False, use_observed_lib_size=True, library_log_means=None, library_log_vars=None, var_activation=None, extra_encoder_kwargs=None, extra_decoder_kwargs=None, batch_embedding_kwargs=None)[source]#

Bases: EmbeddingModuleMixin, BaseMinifiedModeModuleClass

Variational auto-encoder [Lopez et al., 2018].

Parameters:
  • n_input (int) – Number of input features.

  • n_batch (int (default: 0)) – Number of batches. If 0, no batch correction is performed.

  • n_labels (int (default: 0)) – Number of labels.

  • n_hidden (int (default: 128)) – Number of nodes per hidden layer. Passed into Encoder and DecoderSCVI.

  • n_latent (int (default: 10)) – Dimensionality of the latent space.

  • n_layers (int (default: 1)) – Number of hidden layers. Passed into Encoder and DecoderSCVI.

  • n_continuous_cov (int (default: 0)) – Number of continuous covariates.

  • n_cats_per_cov (list[int] | None (default: None)) – A list of integers containing the number of categories for each categorical covariate.

  • dropout_rate (float (default: 0.1)) – Dropout rate. Passed into Encoder but not DecoderSCVI.

  • dispersion (Literal['gene', 'gene-batch', 'gene-label', 'gene-cell'] (default: 'gene')) –

    Flexibility of the dispersion parameter when gene_likelihood is either "nb" or "zinb". One of the following:

    • "gene": parameter is constant per gene across cells.

    • "gene-batch": parameter is constant per gene per batch.

    • "gene-label": parameter is constant per gene per label.

    • "gene-cell": parameter is constant per gene per cell.

  • log_variational (bool (default: True)) – If True, use log1p() on input data before encoding for numerical stability (not normalization).

  • gene_likelihood (Literal['zinb', 'nb', 'poisson'] (default: 'zinb')) –

    Distribution to use for reconstruction in the generative process. One of the following:

    • "zinb": zero-inflated negative binomial distribution.

    • "nb": negative binomial distribution.

    • "poisson": Poisson distribution.

  • latent_distribution (Literal['normal', 'ln'] (default: 'normal')) –

    Distribution to use for the latent space. One of the following:

    • "normal": isotropic normal.

    • "ln": logistic normal with normal params N(0, 1).

  • encode_covariates (bool (default: False)) – If True, covariates are concatenated to gene expression prior to passing through the encoder(s). Else, only gene expression is used.

  • deeply_inject_covariates (bool (default: True)) – If True and n_layers > 1, covariates are concatenated to the outputs of hidden layers in the encoder(s) (if encode_covariates is True) and the decoder prior to passing through the next layer.

  • batch_representation (Literal['one-hot', 'embedding'] (default: 'one-hot')) –

    EXPERIMENTAL: Method for encoding batch information. One of the following:

    • "one-hot": represent batches with one-hot encodings.

    • "embedding": represent batches with continuously-valued embeddings using Embedding.

    Note that batch representations are only passed into the encoder(s) if encode_covariates is True.

  • use_batch_norm (Literal['encoder', 'decoder', 'none', 'both'] (default: 'both')) –

    Specifies where to use BatchNorm1d in the model. One of the following:

    • "none": don’t use batch norm in either encoder(s) or decoder.

    • "encoder": use batch norm only in the encoder(s).

    • "decoder": use batch norm only in the decoder.

    • "both": use batch norm in both encoder(s) and decoder.

    Note: if use_layer_norm is also specified, both will be applied (first BatchNorm1d, then LayerNorm).

  • use_layer_norm (Literal['encoder', 'decoder', 'none', 'both'] (default: 'none')) –

    Specifies where to use LayerNorm in the model. One of the following:

    • "none": don’t use layer norm in either encoder(s) or decoder.

    • "encoder": use layer norm only in the encoder(s).

    • "decoder": use layer norm only in the decoder.

    • "both": use layer norm in both encoder(s) and decoder.

    Note: if use_batch_norm is also specified, both will be applied (first BatchNorm1d, then LayerNorm).

  • use_size_factor_key (bool (default: False)) – If True, use the AnnData obs column specified by the size_factor_key argument to the model’s setup_anndata method as the scaling factor in the mean of the conditional distribution. Takes priority over use_observed_lib_size.

  • use_observed_lib_size (bool (default: True)) – If True, use the observed library size for RNA as the scaling factor in the mean of the conditional distribution.

  • library_log_means (ndarray | None (default: None)) – ndarray of shape (1, n_batch) of means of the log library sizes that parameterize the prior on library size if use_size_factor_key is False and use_observed_lib_size is False.

  • library_log_vars (ndarray | None (default: None)) – ndarray of shape (1, n_batch) of variances of the log library sizes that parameterize the prior on library size if use_size_factor_key is False and use_observed_lib_size is False.

  • var_activation (Callable[[Tensor], Tensor] (default: None)) – Callable used to ensure positivity of the variance of the variational distribution. Passed into Encoder. If None, defaults to exp().

  • extra_encoder_kwargs (dict | None (default: None)) – Additional keyword arguments passed into Encoder.

  • extra_decoder_kwargs (dict | None (default: None)) – Additional keyword arguments passed into DecoderSCVI.

  • batch_embedding_kwargs (dict | None (default: None)) – Keyword arguments passed into Embedding if batch_representation is set to "embedding".

Notes

Lifecycle: argument batch_representation is experimental in v1.2.
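
Examples

Most workflows construct this module indirectly through scvi.model.SCVI, which builds and trains a VAE internally, but the module can also be instantiated directly. A minimal sketch with toy dimensions (the values below are placeholders, not recommendations):

    from scvi.module import VAE

    # Toy dimensions for illustration only.
    module = VAE(
        n_input=200,              # number of genes
        n_batch=2,                # two batches, so batch correction is enabled
        n_latent=10,
        dispersion="gene-batch",  # one dispersion parameter per gene per batch
        gene_likelihood="nb",     # negative binomial reconstruction
    )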

Attributes table#

training

Methods table#

generative(z, library, batch_index[, ...])

Run the generative process.

loss(tensors, inference_outputs, ...[, ...])

Compute the loss.

marginal_ll(tensors, n_mc_samples[, ...])

Compute the marginal log-likelihood of the data under the model.

sample(tensors[, n_samples, max_poisson_rate])

Generate predictive samples from the posterior predictive distribution.

Attributes#

VAE.training: bool#

Methods#

VAE.generative(z, library, batch_index, cont_covs=None, cat_covs=None, size_factor=None, y=None, transform_batch=None)[source]#

Run the generative process.

Return type:

dict[str, Distribution | None]
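
A hedged sketch of how this method is typically driven (assuming the inference outputs expose "z" and "library" keys, as in recent scvi-tools releases): the generative step consumes the encoder's outputs. The minibatch below is illustrative and reuses the module from the constructor example above:

    import torch

    # Toy minibatch: 8 cells by 200 genes of count data (illustrative only).
    x = torch.randint(0, 20, (8, 200)).float()
    batch_index = torch.zeros((8, 1), dtype=torch.long)

    inference_outputs = module.inference(x, batch_index)
    generative_outputs = module.generative(
        inference_outputs["z"], inference_outputs["library"], batch_index
    )
    px = generative_outputs["px"]  # likelihood p(x | z), a Distribution object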

VAE.loss(tensors, inference_outputs, generative_outputs, kl_weight=1.0)[source]#

Compute the loss.

Return type:

LossOutput
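
A sketch of computing the loss for one minibatch through the module's forward call, which chains inference, generative, and loss. The tensor keys follow scvi.REGISTRY_KEYS; the minibatch reuses x and batch_index from the sketch above:

    from scvi import REGISTRY_KEYS

    tensors = {
        REGISTRY_KEYS.X_KEY: x,
        REGISTRY_KEYS.BATCH_KEY: batch_index,
        REGISTRY_KEYS.LABELS_KEY: torch.zeros((8, 1)),
    }
    _, _, loss_output = module(tensors, loss_kwargs={"kl_weight": 0.5})
    loss_output.loss  # scalar objective: reconstruction plus weighted KL terms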

VAE.marginal_ll(tensors, n_mc_samples, return_mean=False, n_mc_samples_per_pass=1)[source]#

Compute the marginal log-likelihood of the data under the model.

Parameters:
  • tensors (dict[str, Tensor]) – Dictionary of tensors passed into forward().

  • n_mc_samples (int) – Number of Monte Carlo samples to use for the estimation of the marginal log-likelihood.

  • return_mean (bool (default: False)) – Whether to return the mean of marginal likelihoods over cells.

  • n_mc_samples_per_pass (int (default: 1)) – Number of Monte Carlo samples to use per pass. This is useful to avoid memory issues.
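
At the model level this method backs scvi.model.SCVI.get_marginal_ll, which iterates over minibatches internally. A hedged usage sketch (assumes adata is an AnnData object already registered via SCVI.setup_anndata; the training budget is illustrative):

    import scvi

    model = scvi.model.SCVI(adata)
    model.train(max_epochs=1)
    ll = model.get_marginal_ll(n_mc_samples=100)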

VAE.sample(tensors, n_samples=1, max_poisson_rate=100000000.0)[source]#

Generate predictive samples from the posterior predictive distribution.

The posterior predictive distribution is denoted as \(p(\hat{x} \mid x)\), where \(x\) is the input data and \(\hat{x}\) is the sampled data.

We sample from this distribution by first sampling n_samples times from the posterior distribution \(q(z \mid x)\) for a given observation, and then sampling from the likelihood \(p(\hat{x} \mid z)\) for each of these.

Parameters:
  • tensors (dict[str, Tensor]) – Dictionary of tensors passed into forward().

  • n_samples (int (default: 1)) – Number of Monte Carlo samples to draw from the distribution for each observation.

  • max_poisson_rate (float (default: 100000000.0)) – The maximum value to which to clip the rate parameter of Poisson. Avoids numerical sampling issues when the parameter is very large due to the variance of the distribution.

Return type:

Tensor

Returns:

Tensor on CPU with shape (n_obs, n_vars) if n_samples == 1, else (n_obs, n_vars, n_samples).
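
A hedged sketch at the module level, reusing the tensors dictionary from the loss example above (placing the Monte Carlo samples on the trailing axis is an assumption based on the return description):

    x_new = module.sample(tensors)                 # shape (8, 200)
    x_draws = module.sample(tensors, n_samples=5)  # shape (8, 200, 5)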