scvi.external.totalanvi.TOTALANVAE

class scvi.external.totalanvi.TOTALANVAE(n_input_genes, n_input_proteins, n_batch=1, n_labels=1, n_hidden=256, n_latent=20, n_layers_encoder=2, n_layers_decoder=1, n_continuous_cov=0, n_cats_per_cov=None, dropout_rate_decoder=0.2, dropout_rate_encoder=0.2, gene_dispersion='gene', protein_dispersion='protein', log_variational=True, gene_likelihood='nb', latent_distribution='normal', protein_batch_mask=None, encode_covariates=True, protein_background_prior_mean=None, protein_background_prior_scale=None, use_size_factor_key=False, use_observed_lib_size=True, extra_payload_autotune=False, library_log_means=None, library_log_vars=None, n_panel=None, panel_key='batch', use_batch_norm='both', use_layer_norm='none', extra_encoder_kwargs=None, extra_decoder_kwargs=None, y_prior=None, labels_groups=None, linear_classifier=False, classifier_parameters=None)

Bases: SupervisedModuleClass, TOTALVAE

Total variational inference for CITE-seq data.

Implements a combination of the scANVI model and the totalVI model of [Gayoso et al., 2021].

Parameters:
  • n_input_genes (int) – Number of input genes

  • n_input_proteins (int) – Number of input proteins

  • n_batch (int (default: 1)) – Number of batches

  • n_labels (int (default: 1)) – Number of labels

  • n_hidden (int (default: 256)) – Number of nodes per hidden layer for encoder and decoder

  • n_latent (int (default: 20)) – Dimensionality of the latent space

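  • n_layers_encoder (int (default: 2)) – Number of hidden layers used for the encoder NN

  • n_layers_decoder (int (default: 1)) – Number of hidden layers used for the decoder NN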

  • n_continuous_cov (int (default: 0)) – Number of continuous covariates

  • n_cats_per_cov (Iterable[int] | None (default: None)) – Number of categories for each extra categorical covariate

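  • dropout_rate_decoder (float (default: 0.2)) – Dropout rate for the decoder neural network

  • dropout_rate_encoder (float (default: 0.2)) – Dropout rate for the encoder neural network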

  • gene_dispersion (Literal['gene', 'gene-batch', 'gene-label'] (default: 'gene')) –

    One of the following

    • 'gene' - the dispersion parameter of the NB is constant per gene across cells

    • 'gene-batch' - the dispersion can differ between batches

    • 'gene-label' - the dispersion can differ between labels

  • protein_dispersion (Literal['protein', 'protein-batch', 'protein-label'] (default: 'protein')) –

    One of the following

    • 'protein' - the dispersion parameter is constant per protein across cells

    • 'protein-batch' - the dispersion can differ between batches (not tested)

    • 'protein-label' - the dispersion can differ between labels (not tested)

  • log_variational (bool (default: True)) – Apply log(data + 1) to the input prior to encoding, for numerical stability. This is not a normalization step.

  • gene_likelihood (Literal['zinb', 'nb'] (default: 'nb')) –

    One of

    • 'nb' - Negative binomial distribution

    • 'zinb' - Zero-inflated negative binomial distribution

  • latent_distribution (Literal['normal', 'ln'] (default: 'normal')) –

    One of

    • 'normal' - Isotropic normal

    • 'ln' - Logistic normal with normal params N(0, 1)

  • protein_batch_mask (dict[str | int, ndarray] (default: None)) – Dictionary mapping each batch code to an array that indicates, for each protein, whether it was observed in that batch.

  • encode_covariates (bool (default: True)) – Whether to concatenate covariates to expression in encoder

  • protein_background_prior_mean (ndarray | None (default: None)) – Array of proteins by batches, the prior initialization for the protein background mean (log scale)

  • protein_background_prior_scale (ndarray | None (default: None)) – Array of proteins by batches, the prior initialization for the protein background scale (log scale)

  • use_size_factor_key (bool (default: False)) – Use the size_factor AnnDataField defined by the user as a scaling factor in the mean of the conditional distribution. Takes priority over use_observed_lib_size.

  • use_observed_lib_size (bool (default: True)) – Use the observed library size for RNA as a scaling factor in the mean of the conditional distribution

  • extra_payload_autotune (bool (default: False)) – If True, returns extra matrices in the loss output to be used during autotune

  • library_log_means (ndarray | None (default: None)) – 1 x n_batch array of means of the log library sizes. Parameterizes prior on library size if not using observed library size.

  • library_log_vars (ndarray | None (default: None)) – 1 x n_batch array of variances of the log library sizes. Parameterizes prior on library size if not using observed library size.

  • use_batch_norm (Literal['encoder', 'decoder', 'none', 'both'] (default: 'both')) – Whether to use batch norm in layers.

  • use_layer_norm (Literal['encoder', 'decoder', 'none', 'both'] (default: 'none')) – Whether to use layer norm in layers.

  • extra_encoder_kwargs (dict | None (default: None)) – Extra keyword arguments passed into EncoderTOTALVI.

  • extra_decoder_kwargs (dict | None (default: None)) – Extra keyword arguments passed into DecoderTOTALVI.

  • linear_classifier (bool (default: False)) – If True, uses a single linear layer for classification instead of a multi-layer perceptron.
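The library_log_means and library_log_vars arguments described above are per-batch summaries of the log library sizes. The following is a minimal sketch of how such values could be computed from raw RNA counts. It uses plain Python lists for illustration, the helper name library_log_stats is hypothetical, and the use of the population variance is an assumption rather than something stated in this reference.

```python
import math
from statistics import fmean, pvariance


def library_log_stats(counts, batch_ids, n_batch):
    """Return (log_means, log_vars), each shaped 1 x n_batch as nested lists.

    counts    : list of per-cell gene count rows
    batch_ids : integer batch code for each cell, in range(n_batch)
    """
    log_lib_by_batch = [[] for _ in range(n_batch)]
    for row, batch in zip(counts, batch_ids):
        total = sum(row)  # library size of this cell
        if total > 0:  # skip empty cells, since log(0) is undefined
            log_lib_by_batch[batch].append(math.log(total))

    # Assumption: population variance. A batch without any non-empty
    # cell makes fmean raise statistics.StatisticsError.
    means = [fmean(values) for values in log_lib_by_batch]
    variances = [pvariance(values) for values in log_lib_by_batch]
    return [means], [variances]


# Toy data: four cells, three genes, two batches.
counts = [[3, 0, 7], [1, 1, 8], [20, 5, 5], [30, 20, 10]]
batch_ids = [0, 0, 1, 1]
log_means, log_vars = library_log_stats(counts, batch_ids, n_batch=2)
```

According to the parameter descriptions, these values only parameterize the library size prior when use_observed_lib_size is False; in practice they would be passed as NumPy arrays.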

Attributes table

Methods table

classify(x, y[, batch_index, cont_covs, ...])

Forward pass through the encoder and classifier.

loss(tensors, inference_outputs, ...[, ...])

Returns the reconstruction loss and the Kullback-Leibler divergences.

Attributes

TOTALANVAE.training: bool

Methods

TOTALANVAE.classify(x, y, batch_index=None, cont_covs=None, cat_covs=None, use_posterior_mean=True)

Forward pass through the encoder and classifier.

Parameters:
  • x (Tensor) – Tensor of shape (n_obs, n_genes).

  • y (Tensor) – Tensor of shape (n_obs, n_proteins).

  • batch_index (Tensor | None (default: None)) – Tensor of shape (n_obs,) denoting batch indices.

  • cont_covs (Tensor | None (default: None)) – Tensor of shape (n_obs, n_continuous_covariates).

  • cat_covs (Tensor | None (default: None)) – Tensor of shape (n_obs, n_categorical_covariates).

  • use_posterior_mean (bool (default: True)) – Whether to use the posterior mean of the latent distribution for classification.

Return type:

Tensor

Returns:

Tensor of shape (n_obs, n_labels) denoting logit scores per label. Before v1.1, this method returned probabilities per label by default; see #2301 for more details.
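Because classify returns logits rather than probabilities, callers that need per-label probabilities have to apply a softmax over the label dimension themselves. The sketch below illustrates that conversion on hand-written logits using plain Python lists; the softmax helper and the example values are illustrative and not part of the library. On real tensors, torch.softmax(logits, dim=-1) would perform the same operation.

```python
import math


def softmax(logits):
    """Convert one row of logits to probabilities (numerically stable)."""
    shift = max(logits)  # subtracting the max avoids overflow in exp
    exps = [math.exp(value - shift) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Stand-in for a (n_obs=2, n_labels=3) logit output.
logits = [[2.0, 0.5, -1.0], [0.0, 0.0, 0.0]]
probs = [softmax(row) for row in logits]

# Predicted label index per observation (argmax over labels).
predicted = [max(range(len(p)), key=p.__getitem__) for p in probs]
```

The argmax is unchanged by the softmax, so label predictions can be taken directly from the logits when probabilities are not needed.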

TOTALANVAE.loss(tensors, inference_outputs, generative_outputs, pro_recons_weight=1.0, kl_weight=1.0, labelled_tensors=None, classification_ratio=None)

Returns the reconstruction loss and the Kullback-Leibler divergences.

Return type:

tuple[FloatTensor, FloatTensor, FloatTensor, FloatTensor]