scvi.external.scviva.nicheVAE

class scvi.external.scviva.nicheVAE(n_input, n_output_niche, n_batch=0, n_labels=0, n_hidden=128, n_latent=10, n_layers=1, n_layers_niche=1, n_layers_compo=1, n_hidden_niche=128, n_hidden_compo=128, n_continuous_cov=0, n_cats_per_cov=None, dropout_rate=0.1, dispersion='gene', log_variational=True, gene_likelihood='poisson', latent_distribution='normal', niche_likelihood='gaussian', cell_rec_weight=1.0, latent_kl_weight=1.0, spatial_weight=10, prior_mixture=False, prior_mixture_k=20, semisupervised=True, linear_classifier=True, inpute_covariates_niche_decoder=True, encode_covariates=False, deeply_inject_covariates=True, batch_representation='one-hot', use_batch_norm='none', use_layer_norm='both', use_size_factor_key=False, use_observed_lib_size=True, library_log_means=None, library_log_vars=None, batch_embedding_kwargs=None, extra_decoder_kwargs=None, extra_encoder_kwargs=None, **vae_kwargs)

Bases: VAE

Variational auto-encoder with niche decoders [Levy et al., 2025].

Parameters:
  • n_input (int) – Number of input features.

  • n_batch (int (default: 0)) – Number of batches. If 0, no batch correction is performed.

  • n_labels (int (default: 0)) – Number of labels.

  • n_hidden (int (default: 128)) – Number of nodes per hidden layer. Passed into Encoder and DecoderSCVI.

  • n_latent (int (default: 10)) – Dimensionality of the latent space.

  • n_layers (int (default: 1)) – Number of hidden layers. Passed into Encoder and DecoderSCVI.

  • n_layers_niche (int (default: 1)) – Number of hidden layers in the niche state decoder.

  • n_layers_compo (int (default: 1)) – Number of hidden layers in the composition decoder.

  • n_hidden_niche (int (default: 128)) – Number of nodes per hidden layer in the niche state decoder.

  • n_hidden_compo (int (default: 128)) – Number of nodes per hidden layer in the composition decoder.

  • n_continuous_cov (int (default: 0)) – Number of continuous covariates.

  • n_cats_per_cov (list[int] | None (default: None)) – A list of integers containing the number of categories for each categorical covariate.

  • dropout_rate (float (default: 0.1)) – Dropout rate. Passed into Encoder but not DecoderSCVI.

  • dispersion (Literal['gene', 'gene-batch', 'gene-label', 'gene-cell'] (default: 'gene')) –

    Flexibility of the dispersion parameter when gene_likelihood is either "nb" or "zinb". One of the following:

    • "gene": parameter is constant per gene across cells.

    • "gene-batch": parameter is constant per gene per batch.

    • "gene-label": parameter is constant per gene per label.

    • "gene-cell": parameter is constant per gene per cell.

  • log_variational (bool (default: True)) – If True, use log1p() on input data before encoding for numerical stability (not normalization).

  • gene_likelihood (Literal['zinb', 'nb', 'poisson'] (default: 'poisson')) –

    Distribution to use for reconstruction in the generative process. One of the following:

    • "nb": negative binomial.

    • "zinb": zero-inflated negative binomial.

    • "poisson": Poisson.

  • latent_distribution (Literal['normal', 'ln'] (default: 'normal')) –

    Distribution to use for the latent space. One of the following:

    • "normal": isotropic normal.

    • "ln": logistic normal with normal params N(0, 1).

  • niche_likelihood (Literal['poisson', 'gaussian'] (default: 'gaussian')) –

    Distribution to use for the niche state. One of the following:

    • "poisson": Poisson.

    • "gaussian": Normal.

    The default is "gaussian"; use "poisson" if the niche state is count data.

  • cell_rec_weight (float (default: 1.0)) – Weight of the cell reconstruction loss.

  • latent_kl_weight (float (default: 1.0)) – Weight of the latent KL divergence.

  • spatial_weight (float (default: 10)) – Weight of the spatial losses.

  • prior_mixture (bool (default: False)) – If True, use a mixture of Gaussians for the latent space. Else, use unimodal Gaussian.

  • prior_mixture_k (int (default: 20)) – Number of components in the Gaussian mixture.

  • semisupervised (bool (default: True)) – If True, use a classifier to predict cell type labels from the latent space.

  • linear_classifier (bool (default: True)) – If True, use a linear classifier. Else, use a neural network.

  • inpute_covariates_niche_decoder (bool (default: True)) – If True, covariates are concatenated to the input of the niche state decoder.

  • encode_covariates (bool (default: False)) – If True, covariates are concatenated to gene expression prior to passing through the encoder(s). Else, only gene expression is used.

  • deeply_inject_covariates (bool (default: True)) – If True and n_layers > 1, covariates are concatenated to the outputs of hidden layers in the encoder(s) (if encode_covariates is True) and the decoder prior to passing through the next layer.

  • batch_representation (Literal['one-hot', 'embedding'] (default: 'one-hot')) –

    EXPERIMENTAL Method for encoding batch information. One of the following:

    • "one-hot": represent batches with one-hot encodings.

    • "embedding": represent batches with continuously-valued embeddings using Embedding.

    Note that batch representations are only passed into the encoder(s) if encode_covariates is True.

  • use_batch_norm (Literal['encoder', 'decoder', 'none', 'both'] (default: 'none')) –

    Specifies where to use BatchNorm1d in the model. One of the following:

    • "none": don’t use batch norm in either encoder(s) or decoder.

    • "encoder": use batch norm only in the encoder(s).

    • "decoder": use batch norm only in the decoder.

    • "both": use batch norm in both encoder(s) and decoder.

    Note: if use_layer_norm is also specified, both will be applied (first BatchNorm1d, then LayerNorm).

  • use_layer_norm (Literal['encoder', 'decoder', 'none', 'both'] (default: 'both')) –

    Specifies where to use LayerNorm in the model. One of the following:

    • "none": don’t use layer norm in either encoder(s) or decoder.

    • "encoder": use layer norm only in the encoder(s).

    • "decoder": use layer norm only in the decoder.

    • "both": use layer norm in both encoder(s) and decoder.

    Note: if use_batch_norm is also specified, both will be applied (first BatchNorm1d, then LayerNorm).

  • use_size_factor_key (bool (default: False)) – If True, use the obs column as defined by the size_factor_key parameter in the model’s setup_anndata method as the scaling factor in the mean of the conditional distribution. Takes priority over use_observed_lib_size.

  • use_observed_lib_size (bool (default: True)) – If True, use the observed library size for RNA as the scaling factor in the mean of the conditional distribution.

  • library_log_means (ndarray | None (default: None)) – ndarray of shape (1, n_batch) of means of the log library sizes that parameterize the prior on library size if use_size_factor_key is False and use_observed_lib_size is False.

  • library_log_vars (ndarray | None (default: None)) – ndarray of shape (1, n_batch) of variances of the log library sizes that parameterize the prior on library size if use_size_factor_key is False and use_observed_lib_size is False.

  • extra_decoder_kwargs (dict | None (default: None)) – Additional keyword arguments passed into DecoderSCVI.

  • batch_embedding_kwargs (dict | None (default: None)) – Keyword arguments passed into Embedding if batch_representation is set to "embedding".
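The precedence among use_size_factor_key, use_observed_lib_size, and the library-size prior described above can be sketched as a small decision function. This is only an illustration of the documented rules: resolve_library_source and its return labels are hypothetical names, not part of scvi-tools.

```python
from typing import Literal

LibrarySource = Literal["size_factor", "observed", "prior"]


def resolve_library_source(
    use_size_factor_key: bool,
    use_observed_lib_size: bool,
) -> LibrarySource:
    """Hypothetical helper: report which scaling source the two flags select."""
    if use_size_factor_key:
        # Documented as taking priority over use_observed_lib_size.
        return "size_factor"
    if use_observed_lib_size:
        return "observed"
    # Both flags are False: the prior parameterized by
    # library_log_means / library_log_vars applies.
    return "prior"


print(resolve_library_source(True, True))    # size_factor
print(resolve_library_source(False, True))   # observed
print(resolve_library_source(False, False))  # prior
```

With the class defaults (use_size_factor_key=False, use_observed_lib_size=True), the sketch returns "observed", so library_log_means and library_log_vars only matter when both flags are switched off.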

Notes

Lifecycle: argument batch_representation is experimental in v1.2.
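To make the two batch_representation options concrete, the following pure-Python sketch contrasts a one-hot vector with a lookup into a table of dense vectors. It assumes nothing about the library's internals: one_hot and make_embedding_table are hypothetical helpers, and the table holds random values, whereas a real embedding would be learned during training.

```python
import random


def one_hot(batch_index: int, n_batch: int) -> list[float]:
    """Return a length-n_batch vector with a single 1.0 at batch_index."""
    if not 0 <= batch_index < n_batch:
        raise ValueError("batch_index out of range")
    return [1.0 if i == batch_index else 0.0 for i in range(n_batch)]


def make_embedding_table(n_batch: int, dim: int, seed: int = 0) -> list[list[float]]:
    """Stand-in for a learned table: one dense vector per batch (random here)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_batch)]


n_batch = 3
table = make_embedding_table(n_batch, dim=2)

print(one_hot(1, n_batch))  # [0.0, 1.0, 0.0]
print(len(table[1]))        # 2
```

The one-hot vector grows with the number of batches, while the embedding has a fixed, chosen dimensionality; that is the practical difference the "embedding" option introduces.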

Attributes table

Methods table

generative(z, library, batch_index[, ...])

Run the generative process.

loss(tensors, inference_outputs, ...[, ...])

Compute the loss.

Attributes

nicheVAE.training: bool

Methods

nicheVAE.generative(z, library, batch_index, cont_covs=None, cat_covs=None, size_factor=None, y=None, transform_batch=None)

Run the generative process.

Return type:

dict[str, Distribution | None]

nicheVAE.loss(tensors, inference_outputs, generative_outputs, kl_weight=1.0, classification_ratio=50, epsilon=1e-06, n_samples_mixture=10)

Compute the loss.

Return type:

NicheLossOutput
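This page does not state how the weights cell_rec_weight, latent_kl_weight, spatial_weight, kl_weight, and classification_ratio enter the loss. One plausible reading is a weighted sum of the individual terms, sketched below. The function combine_losses, its term names, and the exact combination are assumptions made for illustration; the actual computation in nicheVAE.loss may differ.

```python
def combine_losses(
    cell_reconstruction: float,
    latent_kl: float,
    spatial: float,
    classification: float = 0.0,
    *,
    cell_rec_weight: float = 1.0,
    latent_kl_weight: float = 1.0,
    spatial_weight: float = 10.0,
    kl_weight: float = 1.0,
    classification_ratio: float = 50.0,
) -> float:
    """Hypothetical weighted sum of loss terms (assumed, not the library's formula)."""
    return (
        cell_rec_weight * cell_reconstruction
        + kl_weight * latent_kl_weight * latent_kl
        + spatial_weight * spatial
        + classification_ratio * classification
    )


# Default weights mirror the defaults listed on this page.
total = combine_losses(2.0, 0.5, 0.25, 0.125)
print(total)  # 11.25
```

Under this reading, the default spatial_weight of 10 and classification_ratio of 50 make the spatial and classification terms count far more per unit than the cell reconstruction term.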