scvi.module.VAE#
- class scvi.module.VAE(n_input, n_batch=0, n_labels=0, n_hidden=128, n_latent=10, n_layers=1, n_continuous_cov=0, n_cats_per_cov=None, dropout_rate=0.1, dispersion='gene', log_variational=True, gene_likelihood='zinb', latent_distribution='normal', encode_covariates=False, deeply_inject_covariates=True, batch_representation='one-hot', use_batch_norm='both', use_layer_norm='none', use_size_factor_key=False, use_observed_lib_size=True, extra_payload_autotune=False, library_log_means=None, library_log_vars=None, var_activation=None, extra_encoder_kwargs=None, extra_decoder_kwargs=None, batch_embedding_kwargs=None)[source]#
Bases: EmbeddingModuleMixin, BaseMinifiedModeModuleClass

Variational auto-encoder [Lopez et al., 2018].
- Parameters:
  - n_input (int) – Number of input features.
  - n_batch (int (default: 0)) – Number of batches. If 0, no batch correction is performed.
  - n_labels (int (default: 0)) – Number of labels.
  - n_hidden (int (default: 128)) – Number of nodes per hidden layer. Passed into Encoder and DecoderSCVI.
  - n_latent (int (default: 10)) – Dimensionality of the latent space.
  - n_layers (int (default: 1)) – Number of hidden layers. Passed into Encoder and DecoderSCVI.
  - n_continuous_cov (int (default: 0)) – Number of continuous covariates.
  - n_cats_per_cov (list[int] | None (default: None)) – A list of integers containing the number of categories for each categorical covariate.
  - dropout_rate (float (default: 0.1)) – Dropout rate. Passed into Encoder but not DecoderSCVI.
  - dispersion (Literal['gene', 'gene-batch', 'gene-label', 'gene-cell'] (default: 'gene')) – Flexibility of the dispersion parameter when gene_likelihood is either "nb" or "zinb". One of the following:
    - "gene": parameter is constant per gene across cells.
    - "gene-batch": parameter is constant per gene per batch.
    - "gene-label": parameter is constant per gene per label.
    - "gene-cell": parameter is constant per gene per cell.
  - log_variational (bool (default: True)) – If True, use log1p() on input data before encoding for numerical stability (not normalization).
  - gene_likelihood (Literal['zinb', 'nb', 'poisson'] (default: 'zinb')) – Distribution to use for reconstruction in the generative process. One of the following:
    - "nb": NegativeBinomial.
    - "zinb": ZeroInflatedNegativeBinomial.
    - "poisson": Poisson.
    - "normal": Normal.
  - latent_distribution (Literal['normal', 'ln'] (default: 'normal')) – Distribution to use for the latent space. One of the following:
    - "normal": isotropic normal.
    - "ln": logistic normal with normal params N(0, 1).
  - encode_covariates (bool (default: False)) – If True, covariates are concatenated to gene expression prior to passing through the encoder(s). Else, only gene expression is used.
  - deeply_inject_covariates (bool (default: True)) – If True and n_layers > 1, covariates are concatenated to the outputs of hidden layers in the encoder(s) (if encode_covariates is True) and the decoder prior to passing through the next layer.
  - batch_representation (Literal['one-hot', 'embedding'] (default: 'one-hot')) – EXPERIMENTAL. Method for encoding batch information. One of the following:
    - "one-hot": represent batches with one-hot encodings.
    - "embedding": represent batches with continuously-valued embeddings using Embedding.

    Note that batch representations are only passed into the encoder(s) if encode_covariates is True.
  - use_batch_norm (Literal['encoder', 'decoder', 'none', 'both'] (default: 'both')) – Specifies where to use BatchNorm1d in the model. One of the following:
    - "none": don’t use batch norm in either encoder(s) or decoder.
    - "encoder": use batch norm only in the encoder(s).
    - "decoder": use batch norm only in the decoder.
    - "both": use batch norm in both encoder(s) and decoder.

    Note: if use_layer_norm is also specified, both will be applied (first BatchNorm1d, then LayerNorm).
  - use_layer_norm (Literal['encoder', 'decoder', 'none', 'both'] (default: 'none')) – Specifies where to use LayerNorm in the model. One of the following:
    - "none": don’t use layer norm in either encoder(s) or decoder.
    - "encoder": use layer norm only in the encoder(s).
    - "decoder": use layer norm only in the decoder.
    - "both": use layer norm in both encoder(s) and decoder.

    Note: if use_batch_norm is also specified, both will be applied (first BatchNorm1d, then LayerNorm).
  - use_size_factor_key (bool (default: False)) – If True, use the obs column as defined by the size_factor_key parameter in the model’s setup_anndata method as the scaling factor in the mean of the conditional distribution. Takes priority over use_observed_lib_size.
  - use_observed_lib_size (bool (default: True)) – If True, use the observed library size for RNA as the scaling factor in the mean of the conditional distribution.
  - extra_payload_autotune (bool (default: False)) – If True, return extra matrices in the loss output to be used during autotune.
  - library_log_means (ndarray | None (default: None)) – ndarray of shape (1, n_batch) of means of the log library sizes that parameterize the prior on library size if use_size_factor_key is False and use_observed_lib_size is False.
  - library_log_vars (ndarray | None (default: None)) – ndarray of shape (1, n_batch) of variances of the log library sizes that parameterize the prior on library size if use_size_factor_key is False and use_observed_lib_size is False.
  - var_activation (Callable[[Tensor], Tensor] (default: None)) – Callable used to ensure positivity of the variance of the variational distribution. Passed into Encoder. Defaults to exp().
  - extra_encoder_kwargs (dict | None (default: None)) – Additional keyword arguments passed into Encoder.
  - extra_decoder_kwargs (dict | None (default: None)) – Additional keyword arguments passed into DecoderSCVI.
  - batch_embedding_kwargs (dict | None (default: None)) – Keyword arguments passed into Embedding if batch_representation is set to "embedding".
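
Two of the option groups above interact in ways that are easier to see in code: the dispersion setting determines how a dispersion parameter would be indexed, and the library-size flags follow a fixed precedence. The sketch below is illustrative only and does not use scvi-tools; the helper names `dispersion_shape` and `library_size_source`, the returned labels, and the choice to raise `ValueError` when the prior parameters are missing are all assumptions made for this example.

```python
def dispersion_shape(dispersion, n_input, n_batch=0, n_labels=0):
    """Shape a free dispersion parameter could take for each option.

    Hypothetical helper. Returns None for "gene-cell", where a value is
    needed per gene per cell and so cannot be one fixed-size parameter.
    """
    if dispersion == "gene":
        return (n_input,)            # one value per gene
    if dispersion == "gene-batch":
        return (n_input, n_batch)    # one value per gene per batch
    if dispersion == "gene-label":
        return (n_input, n_labels)   # one value per gene per label
    if dispersion == "gene-cell":
        return None
    raise ValueError(f"unknown dispersion option: {dispersion!r}")


def library_size_source(use_size_factor_key=False, use_observed_lib_size=True,
                        library_log_means=None, library_log_vars=None):
    """Report which quantity scales the mean of the conditional distribution."""
    if use_size_factor_key:
        # Documented to take priority over use_observed_lib_size.
        return "size_factor"
    if use_observed_lib_size:
        return "observed_library_size"
    # Both flags are False: the prior on library size is used, which is
    # parameterized by library_log_means and library_log_vars.
    # Assumption: treat missing prior parameters as an error.
    if library_log_means is None or library_log_vars is None:
        raise ValueError("library_log_means and library_log_vars are needed here")
    return "library_size_prior"
```

For example, `dispersion_shape("gene-batch", 2000, n_batch=3)` gives `(2000, 3)`, and calling `library_size_source()` with the documented defaults reports the observed library size.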
Attributes table#
Methods table#
| generative | Run the generative process. |
| loss | Compute the loss. |
| marginal_ll | Compute the marginal log-likelihood of the data under the model. |
| sample | Generate predictive samples from the posterior predictive distribution. |
Attributes#
- VAE.training: bool#
Methods#
- VAE.generative(z, library, batch_index, cont_covs=None, cat_covs=None, size_factor=None, y=None, transform_batch=None)[source]#
Run the generative process.
- VAE.loss(tensors, inference_outputs, generative_outputs, kl_weight=1.0)[source]#
Compute the loss.
- Return type: LossOutput
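
The entry above does not say how kl_weight enters the loss. A common pattern in variational auto-encoders, assumed here and not confirmed by this page, is to multiply the KL divergence term by the weight so that it can be warmed up during training. The function names below are hypothetical and the example works on plain floats, not on the tensors the real method receives.

```python
import math


def kl_normal_standard(mu, sigma):
    """KL( N(mu, sigma^2) || N(0, 1) ) for one latent dimension (closed form)."""
    return 0.5 * (sigma ** 2 + mu ** 2 - 1.0) - math.log(sigma)


def weighted_loss(reconstruction_loss, kl_terms, kl_weight=1.0):
    """Reconstruction loss plus a weighted sum of per-dimension KL terms.

    Assumption: kl_weight scales the KL term only. With kl_weight=0 the
    loss reduces to the reconstruction term.
    """
    return reconstruction_loss + kl_weight * sum(kl_terms)
```

With this sketch, a posterior that equals the prior contributes nothing: `kl_normal_standard(0.0, 1.0)` evaluates to zero, so the weight only matters once the posterior moves away from N(0, 1).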
- VAE.marginal_ll(tensors, n_mc_samples, return_mean=False, n_mc_samples_per_pass=1)[source]#
Compute the marginal log-likelihood of the data under the model.
- Parameters:
  - tensors (dict[str, Tensor]) – Dictionary of tensors passed into forward().
  - n_mc_samples (int) – Number of Monte Carlo samples to use for the estimation of the marginal log-likelihood.
  - return_mean (bool (default: False)) – Whether to return the mean of marginal likelihoods over cells.
  - n_mc_samples_per_pass (int (default: 1)) – Number of Monte Carlo samples to use per pass. This is useful to avoid memory issues.
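
The relationship between n_mc_samples and n_mc_samples_per_pass can be sketched as a chunked Monte Carlo estimate. The example assumes an importance-sampling estimator, where each sample yields a log importance weight and the estimate is the log of their mean; this page does not state the estimator, so treat that as an assumption. `draw_log_weights` is a hypothetical callable standing in for one model pass.

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) for a non-empty list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def marginal_ll_estimate(draw_log_weights, n_mc_samples, n_mc_samples_per_pass=1):
    """Estimate a marginal log-likelihood from log importance weights.

    draw_log_weights(k) is a hypothetical callable returning k log weights.
    Samples are drawn in passes of at most n_mc_samples_per_pass so that
    only a small batch of samples is produced at a time.
    """
    log_weights = []
    remaining = n_mc_samples
    while remaining > 0:
        k = min(n_mc_samples_per_pass, remaining)
        log_weights.extend(draw_log_weights(k))
        remaining -= k
    # log of the average weight: logsumexp(log w) - log(n)
    return log_sum_exp(log_weights) - math.log(n_mc_samples)
```

The number of samples per pass changes only how the work is split, not the estimate itself, which is why it can be lowered freely when memory is tight.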
- VAE.sample(tensors, n_samples=1, max_poisson_rate=100000000.0, generative_kwargs=None)[source]#
Generate predictive samples from the posterior predictive distribution.
The posterior predictive distribution is denoted as \(p(\hat{x} \mid x)\), where \(x\) is the input data and \(\hat{x}\) is the sampled data.
We sample from this distribution by first sampling n_samples times from the posterior distribution \(q(z \mid x)\) for a given observation, and then sampling from the likelihood \(p(\hat{x} \mid z)\) for each of these.
- Parameters:
  - tensors (dict[str, Tensor]) – Dictionary of tensors passed into forward().
  - n_samples (int (default: 1)) – Number of Monte Carlo samples to draw from the distribution for each observation.
  - max_poisson_rate (float (default: 100000000.0)) – The maximum value to which to clip the rate parameter of Poisson. Avoids numerical sampling issues when the parameter is very large due to the variance of the distribution.
  - generative_kwargs (dict | None (default: None)) – Keyword arguments passed into generative() in the forward pass.
- Return type: Tensor
- Returns: Tensor on CPU with shape (n_obs, n_vars) if n_samples == 1, else (n_obs, n_vars, n_samples).
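
The role of max_poisson_rate can be illustrated with a small standard-library sketch. It assumes the Poisson rate is itself drawn from a Gamma distribution (the usual Gamma-Poisson construction of a negative binomial), which is an assumption about where a very large rate might come from and not something this page states. The function name `sample_clipped_rate` is hypothetical.

```python
import random


def sample_clipped_rate(mean, inverse_dispersion, max_poisson_rate=1e8, rng=None):
    """Draw a Gamma-distributed rate with the given mean, then clip it.

    Gamma(shape=inverse_dispersion, scale=mean / inverse_dispersion) has
    expectation `mean`. Its right tail can produce very large draws, which
    is the situation the clipping guards against.
    """
    rng = rng or random.Random()
    rate = rng.gammavariate(inverse_dispersion, mean / inverse_dispersion)
    return min(rate, max_poisson_rate)
```

With the documented default of 100000000.0 the clip rarely triggers; passing a small value such as `max_poisson_rate=3.0` makes its effect visible, since every returned rate is then at most 3.0.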