scvi.module.TOTALVAE
- class scvi.module.TOTALVAE(n_input_genes, n_input_proteins, n_batch=0, n_labels=0, n_hidden=256, n_latent=20, n_layers_encoder=2, n_layers_decoder=1, n_continuous_cov=0, n_cats_per_cov=None, dropout_rate_decoder=0.2, dropout_rate_encoder=0.2, gene_dispersion='gene', protein_dispersion='protein', log_variational=True, gene_likelihood='nb', latent_distribution='normal', protein_batch_mask=None, encode_covariates=True, protein_background_prior_mean=None, protein_background_prior_scale=None, use_size_factor_key=False, use_observed_lib_size=True, library_log_means=None, library_log_vars=None, use_batch_norm='both', use_layer_norm='none', extra_encoder_kwargs=None, extra_decoder_kwargs=None)
Bases: BaseMinifiedModeModuleClass
Total variational inference for CITE-seq data.
Implements the totalVI model of [Gayoso et al., 2021].
- Parameters:
n_input_genes (int) – Number of input genes
n_input_proteins (int) – Number of input proteins
n_batch (int (default: 0)) – Number of batches
n_labels (int (default: 0)) – Number of labels
n_hidden (int (default: 256)) – Number of nodes per hidden layer for encoder and decoder
n_latent (int (default: 20)) – Dimensionality of the latent space
n_layers_encoder (int (default: 2)) – Number of hidden layers used for the encoder NN
n_layers_decoder (int (default: 1)) – Number of hidden layers used for the decoder NN
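n_continuous_cov (int (default: 0)) – Number of continuous covariates
n_cats_per_cov (Iterable[int] | None (default: None)) – Number of categories for each extra categorical covariate
dropout_rate_decoder (float (default: 0.2)) – Dropout rate for the decoder neural network
dropout_rate_encoder (float (default: 0.2)) – Dropout rate for the encoder neural network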
gene_dispersion (Literal['gene', 'gene-batch', 'gene-label'] (default: 'gene')) – One of the following:
'gene' - gene_dispersion parameter of NB is constant per gene across cells
'gene-batch' - gene_dispersion can differ between different batches
'gene-label' - gene_dispersion can differ between different labels
protein_dispersion (Literal['protein', 'protein-batch', 'protein-label'] (default: 'protein')) – One of the following:
'protein' - protein_dispersion parameter is constant per protein across cells
'protein-batch' - protein_dispersion can differ between different batches (not tested)
'protein-label' - protein_dispersion can differ between different labels (not tested)
log_variational (bool (default: True)) – Apply log(data + 1) prior to encoding for numerical stability. This is not normalization.
gene_likelihood (Literal['zinb', 'nb'] (default: 'nb')) – One of:
'nb' - Negative binomial distribution
'zinb' - Zero-inflated negative binomial distribution
latent_distribution (Literal['normal', 'ln'] (default: 'normal')) – One of:
'normal' - Isotropic normal
'ln' - Logistic normal with normal params N(0, 1)
protein_batch_mask (dict[str | int, ndarray] (default: None)) – Dictionary where each key is a batch code and each value indicates, for each protein, whether it was observed or not.
encode_covariates (bool (default: True)) – Whether to concatenate covariates to expression in encoder
protein_background_prior_mean (ndarray | None (default: None)) – Array of proteins by batches, the prior initialization for the protein background mean (log scale)
protein_background_prior_scale (ndarray | None (default: None)) – Array of proteins by batches, the prior initialization for the protein background scale (log scale)
use_size_factor_key (bool (default: False)) – Use size_factor AnnDataField defined by the user as scaling factor in mean of conditional distribution. Takes priority over use_observed_lib_size.
use_observed_lib_size (bool (default: True)) – Use observed library size for RNA as scaling factor in mean of conditional distribution
library_log_means (ndarray | None (default: None)) – 1 x n_batch array of means of the log library sizes. Parameterizes prior on library size if not using observed library size.
library_log_vars (ndarray | None (default: None)) – 1 x n_batch array of variances of the log library sizes. Parameterizes prior on library size if not using observed library size.
use_batch_norm (Literal['encoder', 'decoder', 'none', 'both'] (default: 'both')) – Whether to use batch norm in layers.
use_layer_norm (Literal['encoder', 'decoder', 'none', 'both'] (default: 'none')) – Whether to use layer norm in layers.
extra_encoder_kwargs (dict | None (default: None)) – Extra keyword arguments passed into EncoderTOTALVI.
extra_decoder_kwargs (dict | None (default: None)) – Extra keyword arguments passed into DecoderTOTALVI.
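The library_log_means and library_log_vars arguments are described above as 1 x n_batch arrays of per-batch statistics of the log library size. The following is a minimal sketch of how such values could be computed from raw gene counts. It is not the library's own initialization routine: the helper name library_log_stats is hypothetical, plain nested lists stand in for ndarrays, and the use of the population variance (dividing by N) is an assumption.

```python
import math
from collections import defaultdict


def library_log_stats(counts, batch_codes, n_batch):
    """Per-batch mean and variance of the log library size.

    counts: per-cell lists of gene counts (cells x genes).
    batch_codes: integer batch code of each cell, in range(n_batch).
    Returns (means, variances), each a 1 x n_batch nested list.
    """
    log_sizes = defaultdict(list)
    for cell, batch in zip(counts, batch_codes):
        total = sum(cell)
        if total <= 0:
            raise ValueError("every cell needs at least one count")
        log_sizes[batch].append(math.log(total))

    means, variances = [], []
    for b in range(n_batch):
        values = log_sizes[b]
        if not values:
            raise ValueError(f"batch {b} has no cells")
        mean = sum(values) / len(values)
        # Population variance (divide by N); this choice is an assumption.
        var = sum((v - mean) ** 2 for v in values) / len(values)
        means.append(mean)
        variances.append(var)
    return [means], [variances]


# Toy data: four cells, three genes, two batches.
counts = [[10, 0, 5], [3, 2, 1], [100, 50, 50], [20, 20, 10]]
batch_codes = [0, 0, 1, 1]
library_log_means, library_log_vars = library_log_stats(counts, batch_codes, n_batch=2)
```

Before passing such values to the constructor they would need to be converted to ndarrays of shape (1, n_batch). According to the parameter descriptions, they only matter when the observed library size is not used.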
Attributes table
Methods table
| Run the generative step. |
| Compute reconstruction loss. |
| Returns the tensors of dispersions for genes and proteins. |
| Returns the reconstruction loss and the Kullback-Leibler divergences. |
| Computes the marginal log likelihood of the data under the model. |
| Callback function run in |
| Sample from the generative model. |
Attributes
- TOTALVAE.training: bool
Methods
- TOTALVAE.generative(z, library_gene, batch_index, label, cont_covs=None, cat_covs=None, size_factor=None, transform_batch=None)
Run the generative step.
- TOTALVAE.get_reconstruction_loss(x, y, px_dict, py_dict, pro_batch_mask_minibatch=None)
Compute reconstruction loss.
- Return type:
tuple[Tensor, Tensor]
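To illustrate what a pair of reconstruction losses can look like, here is a minimal sketch in plain Python. It assumes the gene term is a negative binomial negative log-likelihood (matching gene_likelihood='nb') parameterized by a mean mu and an inverse dispersion theta, and that the protein term is a sum of per-protein losses restricted to observed proteins via a boolean mask. The function names, the parameterization, and the treatment of the protein term as precomputed values are all assumptions for illustration, not the module's implementation.

```python
import math


def nb_log_prob(x, mu, theta):
    """Log-probability of count x under NB with mean mu, inverse dispersion theta."""
    return (
        math.lgamma(x + theta)
        - math.lgamma(theta)
        - math.lgamma(x + 1)
        + theta * math.log(theta / (theta + mu))
        + x * math.log(mu / (theta + mu))
    )


def gene_reconstruction_loss(x_row, mu_row, theta_row):
    """Negative log-likelihood of one cell, summed over genes."""
    return -sum(nb_log_prob(x, m, t) for x, m, t in zip(x_row, mu_row, theta_row))


def masked_protein_loss(per_protein_loss, observed_mask):
    """Sum per-protein losses of one cell, skipping unobserved proteins."""
    return sum(l for l, seen in zip(per_protein_loss, observed_mask) if seen)


# One toy cell: three genes, three proteins (the second protein is unobserved).
gene_loss = gene_reconstruction_loss([0, 2, 5], [1.0, 2.0, 4.0], [2.0, 2.0, 2.0])
protein_loss = masked_protein_loss([1.0, 2.0, 3.0], [True, False, True])
losses = (gene_loss, protein_loss)
```

Masking out unobserved proteins keeps batches that measured a smaller antibody panel from contributing spurious loss terms, which is the role the pro_batch_mask_minibatch argument suggests.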
- TOTALVAE.get_sample_dispersion(x, y, batch_index=None, label=None, n_samples=1)
Returns the tensors of dispersions for genes and proteins.
- Parameters:
x (Tensor) – tensor of values with shape (batch_size, n_input_genes)
y (Tensor) – tensor of values with shape (batch_size, n_input_proteins)
batch_index (Tensor | None (default: None)) – array that indicates which batch the cells belong to with shape batch_size
label (Tensor | None (default: None)) – tensor of cell-type labels with shape (batch_size, n_labels)
n_samples (int (default: 1)) – number of samples
- Return type:
tuple[Tensor, Tensor]
- Returns:
tensors of dispersions of the negative binomial distribution
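The dispersion modes ('gene', 'gene-batch', 'gene-label') determine whether every cell shares one dispersion vector or receives one selected by its batch or label. The sketch below shows that selection with plain lists. The helper per_cell_dispersion is hypothetical, the parameter layout (one vector per batch or label) is an assumption, and batches and labels are given as integer codes rather than tensors.

```python
def per_cell_dispersion(mode, params, n_cells, batch_index=None, label=None):
    """Return one dispersion vector per cell.

    mode 'gene': params is a single vector shared by all cells.
    mode 'gene-batch': params[b] is the vector for batch code b.
    mode 'gene-label': params[l] is the vector for label code l.
    """
    if mode == "gene":
        return [list(params) for _ in range(n_cells)]
    if mode == "gene-batch":
        codes = batch_index
    elif mode == "gene-label":
        codes = label
    else:
        raise ValueError(f"unknown dispersion mode: {mode!r}")
    if codes is None or len(codes) != n_cells:
        raise ValueError("need one batch or label code per cell")
    return [list(params[c]) for c in codes]


# Shared dispersion: every cell sees the same two-gene vector.
shared = per_cell_dispersion("gene", [1.0, 2.0], n_cells=3)

# Batch-specific dispersion: cells pick the vector of their batch.
by_batch = per_cell_dispersion(
    "gene-batch", [[1.0, 2.0], [10.0, 20.0]], n_cells=3, batch_index=[1, 0, 1]
)
```

The same pattern would apply to the protein modes, with proteins in place of genes.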
- TOTALVAE.loss(tensors, inference_outputs, generative_outputs, pro_recons_weight=1.0, kl_weight=1.0)
Returns the reconstruction loss and the Kullback-Leibler divergences.
- Parameters:
x – tensor of values with shape (batch_size, n_input_genes)
y – tensor of values with shape (batch_size, n_input_proteins)
batch_index – array that indicates which batch the cells belong to with shape batch_size
label – tensor of cell-type labels with shape (batch_size, n_labels)
- Return type:
tuple[FloatTensor, FloatTensor, FloatTensor, FloatTensor]
- Returns:
the reconstruction loss and the Kullback-Leibler divergences
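The pro_recons_weight and kl_weight arguments suggest a weighted combination of reconstruction and divergence terms. The following sketch shows one way such a combination could be formed, together with the closed-form KL divergence between a diagonal normal and a standard normal. The exact grouping of terms in the module may differ (for example, not every divergence term need be scaled by kl_weight); the function names and the standard normal prior are assumptions for illustration.

```python
import math


def kl_normal_standard(mu, sigma):
    """KL( N(mu, diag(sigma^2)) || N(0, I) ) for one cell's latent vector."""
    return 0.5 * sum(
        s * s + m * m - 1.0 - 2.0 * math.log(s) for m, s in zip(mu, sigma)
    )


def weighted_loss(recon_gene, recon_protein, kl, pro_recons_weight=1.0, kl_weight=1.0):
    """Mean over cells of gene loss + weighted protein loss + weighted KL."""
    per_cell = [
        g + pro_recons_weight * p + kl_weight * k
        for g, p, k in zip(recon_gene, recon_protein, kl)
    ]
    return sum(per_cell) / len(per_cell)


# Two toy cells with precomputed per-cell terms.
kl_per_cell = [kl_normal_standard([1.0], [1.0]), kl_normal_standard([0.0], [1.0])]
total = weighted_loss(
    recon_gene=[1.0, 3.0],
    recon_protein=[2.0, 2.0],
    kl=kl_per_cell,
    pro_recons_weight=0.5,
    kl_weight=2.0,
)
```

Setting kl_weight below 1.0 early in training and raising it gradually is a common warm-up strategy for variational autoencoders, which is one reason to expose the weight as an argument.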