scvi.external.totalanvi.TOTALANVAE


class scvi.external.totalanvi.TOTALANVAE(n_input_genes, n_input_proteins, n_batch=1, n_labels=1, n_hidden=256, n_latent=20, n_layers_encoder=2, n_layers_decoder=1, n_continuous_cov=0, n_cats_per_cov=None, dropout_rate_decoder=0.2, dropout_rate_encoder=0.2, gene_dispersion='gene', protein_dispersion='protein', log_variational=True, gene_likelihood='nb', latent_distribution='normal', protein_batch_mask=None, encode_covariates=True, protein_background_prior_mean=None, protein_background_prior_scale=None, use_size_factor_key=False, use_observed_lib_size=True, extra_payload_autotune=False, library_log_means=None, library_log_vars=None, n_panel=None, panel_key='batch', use_batch_norm='both', use_layer_norm='none', extra_encoder_kwargs=None, extra_decoder_kwargs=None, y_prior=None, labels_groups=None, linear_classifier=False, classifier_parameters=None)

Bases: SupervisedModuleClass, TOTALVAE

Total variational inference for CITE-seq data.

Implements a combination of the scANVI model and the totalVI model of [Gayoso et al., 2021].

Parameters:

n_input_genes (int) – Number of input genes
n_input_proteins (int) – Number of input proteins
n_batch (int (default: 1)) – Number of batches
n_labels (int (default: 1)) – Number of labels
n_hidden (int (default: 256)) – Number of nodes per hidden layer for encoder and decoder
n_latent (int (default: 20)) – Dimensionality of the latent space
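n_layers_encoder (int (default: 2)) – Number of hidden layers used for the encoder NN
n_layers_decoder (int (default: 1)) – Number of hidden layers used for the decoder NN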
n_continuous_cov (int (default: 0)) – Number of continuous covariates
n_cats_per_cov (Iterable[int] | None (default: None)) – Number of categories for each extra categorical covariate
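dropout_rate_decoder (float (default: 0.2)) – Dropout rate for the decoder neural network
dropout_rate_encoder (float (default: 0.2)) – Dropout rate for the encoder neural network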
gene_dispersion (Literal['gene', 'gene-batch', 'gene-label'] (default: 'gene')) –
One of the following
- 'gene' - the gene dispersion parameter of the NB is constant per gene across cells
- 'gene-batch' - the gene dispersion can differ between batches
- 'gene-label' - the gene dispersion can differ between labels
protein_dispersion (Literal['protein', 'protein-batch', 'protein-label'] (default: 'protein')) –
One of the following
- 'protein' - the protein dispersion parameter is constant per protein across cells
- 'protein-batch' - the protein dispersion can differ between batches (not tested)
- 'protein-label' - the protein dispersion can differ between labels (not tested)
log_variational (bool (default: True)) – Apply log(1 + x) to the input data prior to encoding, for numerical stability. This is not a normalization.
gene_likelihood (Literal['zinb', 'nb'] (default: 'nb')) –
One of
- 'nb' - Negative binomial distribution
- 'zinb' - Zero-inflated negative binomial distribution
latent_distribution (Literal['normal', 'ln'] (default: 'normal')) –
One of
- 'normal' - Isotropic normal
- 'ln' - Logistic normal with normal params N(0, 1)
protein_batch_mask (dict[str | int, ndarray] (default: None)) – Dictionary in which each key is a batch code and each value is an array indicating, for each protein, whether it was observed in that batch.
encode_covariates (bool (default: True)) – Whether to concatenate covariates to expression in encoder
protein_background_prior_mean (ndarray | None (default: None)) – Array of proteins by batches, the prior initialization for the protein background mean (log scale)
protein_background_prior_scale (ndarray | None (default: None)) – Array of proteins by batches, the prior initialization for the protein background scale (log scale)
use_size_factor_key (bool (default: False)) – Use the size_factor AnnDataField defined by the user as a scaling factor in the mean of the conditional distribution. Takes priority over use_observed_lib_size.
use_observed_lib_size (bool (default: True)) – Use the observed library size for RNA as a scaling factor in the mean of the conditional distribution
extra_payload_autotune (bool (default: False)) – If True, returns extra matrices in the loss output to be used during autotune
library_log_means (ndarray | None (default: None)) – 1 x n_batch array of means of the log library sizes. Parameterizes prior on library size if not using observed library size.
library_log_vars (ndarray | None (default: None)) – 1 x n_batch array of variances of the log library sizes. Parameterizes prior on library size if not using observed library size.
use_batch_norm (Literal['encoder', 'decoder', 'none', 'both'] (default: 'both')) – Whether to use batch norm in layers.
use_layer_norm (Literal['encoder', 'decoder', 'none', 'both'] (default: 'none')) – Whether to use layer norm in layers.
extra_encoder_kwargs (dict | None (default: None)) – Extra keyword arguments passed into EncoderTOTALVI.
extra_decoder_kwargs (dict | None (default: None)) – Extra keyword arguments passed into DecoderTOTALVI.
linear_classifier (bool (default: False)) – If True, uses a single linear layer for classification instead of a multi-layer perceptron.
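
The library_log_means and library_log_vars parameters above are described as 1 x n_batch arrays of per-batch statistics of the log library sizes. The following is a minimal sketch of how such statistics could be computed from raw RNA counts, written in plain Python so that it runs without dependencies. The helper name library_log_stats, the toy data, the use of the population variance, and the use of nested lists in place of ndarray are all assumptions made for illustration; this is not the library's own routine.

```python
import math
from statistics import fmean, pvariance


def library_log_stats(counts, batch_codes, n_batch):
    """Per-batch mean and variance of the log total RNA count per cell.

    Hypothetical helper for illustration only.
    counts: list of per-cell gene count rows.
    batch_codes: per-cell integer batch index in [0, n_batch).
    Returns two nested lists shaped 1 x n_batch (means, variances).
    Raises statistics.StatisticsError if a batch has no usable cells.
    """
    log_sizes = [[] for _ in range(n_batch)]
    for row, batch in zip(counts, batch_codes):
        total = sum(row)
        if total > 0:  # skip empty cells, since log(0) is undefined
            log_sizes[batch].append(math.log(total))
    # Assumption: population variance (divide by n, not n - 1).
    means = [[fmean(values) for values in log_sizes]]
    variances = [[pvariance(values) for values in log_sizes]]
    return means, variances


# Toy data: four cells, three genes, two batches.
counts = [[1, 2, 3], [4, 0, 6], [10, 10, 0], [5, 5, 10]]
batch_codes = [0, 0, 1, 1]
library_log_means, library_log_vars = library_log_stats(counts, batch_codes, n_batch=2)
```

According to the parameter descriptions, these values only parameterize the library size prior when the observed library size is not used, that is, when use_observed_lib_size is False.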

Attributes table

training

Methods table

`classify`(x, y[, batch_index, cont_covs, ...])	Forward pass through the encoder and classifier.
`loss`(tensors, inference_outputs, ...[, ...])	Returns the reconstruction loss and the Kullback-Leibler divergences

Attributes

TOTALANVAE.training: bool

Methods

TOTALANVAE.classify(x, y, batch_index=None, cont_covs=None, cat_covs=None, use_posterior_mean=True)

Forward pass through the encoder and classifier.

Parameters:

x (Tensor) – Tensor of shape (n_obs, n_genes).
y (Tensor) – Tensor of shape (n_obs, n_proteins).
batch_index (Tensor | None (default: None)) – Tensor of shape (n_obs,) denoting batch indices.
cont_covs (Tensor | None (default: None)) – Tensor of shape (n_obs, n_continuous_covariates).
cat_covs (Tensor | None (default: None)) – Tensor of shape (n_obs, n_categorical_covariates).
use_posterior_mean (bool (default: True)) – Whether to use the posterior mean of the latent distribution for classification.

Return type:

Tensor

Returns:

Tensor of shape (n_obs, n_labels) containing the logit score for each label. Before v1.1, this method returned per-label probabilities by default; see #2301 for more details.
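
Because classify returns logits rather than probabilities, callers that need per-label probabilities have to apply a softmax over the label axis themselves. The sketch below shows that conversion in plain Python on made-up logit values; the softmax helper and the example numbers are illustrative assumptions. With real Tensor outputs, torch.nn.functional.softmax(logits, dim=-1) performs the same operation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one row of logit scores."""
    shift = max(logits)  # subtracting the max avoids overflow in exp
    exps = [math.exp(value - shift) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


# Made-up logits of shape (n_obs=2, n_labels=3).
logits = [[2.0, 0.5, -1.0], [0.0, 0.0, 0.0]]

# Per-label probabilities, one row per observation.
probs = [softmax(row) for row in logits]

# Index of the most probable label per observation (first index on ties).
predicted = [max(range(len(row)), key=row.__getitem__) for row in probs]
```

Taking the argmax of the logits directly gives the same predicted label, since softmax preserves ordering.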

TOTALANVAE.loss(tensors, inference_outputs, generative_outputs, pro_recons_weight=1.0, kl_weight=1.0, labelled_tensors=None, classification_ratio=None)

Returns the reconstruction loss and the Kullback-Leibler divergences

Return type:

tuple[FloatTensor, FloatTensor, FloatTensor, FloatTensor]
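
The signature of loss exposes three scalar weights: pro_recons_weight, kl_weight and classification_ratio. The page does not state how they enter the objective, so the sketch below only illustrates one common way such weights are combined in a semi-supervised VAE objective. The function combine_losses, its arguments, and the single aggregated KL term are assumptions made for illustration and should not be read as the actual implementation.

```python
def combine_losses(
    recon_gene,
    recon_protein,
    kl_total,
    classification_loss=0.0,
    pro_recons_weight=1.0,
    kl_weight=1.0,
    classification_ratio=None,
):
    """Weighted sum of loss terms (hypothetical helper, illustration only).

    Assumes the protein reconstruction term is scaled by pro_recons_weight,
    a single aggregated KL term is scaled by kl_weight, and the
    classification term is added only when classification_ratio is given.
    """
    total = recon_gene + pro_recons_weight * recon_protein + kl_weight * kl_total
    if classification_ratio is not None:
        total += classification_ratio * classification_loss
    return total


# Made-up per-batch averages for each term.
unweighted = combine_losses(10.0, 4.0, 2.0)
warmup = combine_losses(10.0, 4.0, 2.0, kl_weight=0.0)
supervised = combine_losses(
    10.0, 4.0, 2.0, classification_loss=0.1, classification_ratio=50
)
```

Under this assumption, setting kl_weight below 1.0 relaxes the KL penalty, which is how KL warm-up schedules are usually expressed.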