TOTALVI
-
class scvi.models.TOTALVI(n_input_genes, n_input_proteins, n_batch=0, n_labels=0, n_hidden=256, n_latent=20, n_layers_encoder=1, n_layers_decoder=1, dropout_rate_decoder=0.2, dropout_rate_encoder=0.2, gene_dispersion='gene', protein_dispersion='protein', log_variational=True, reconstruction_loss_gene='nb', latent_distribution='ln', protein_batch_mask=None, encoder_batch=True)
Bases: torch.nn.modules.module.Module
Total variational inference for CITE-seq data
Implements the totalVI model of [GayosoSteier20].
- Parameters
n_hidden (int) – Number of nodes per hidden layer for the z encoder (proteins + genes), the gene library encoder, and the z -> genes + proteins decoder
n_layers_encoder, n_layers_decoder – Number of hidden layers used for the encoder and decoder NNs, respectively
dropout_rate_encoder, dropout_rate_decoder – Dropout rate for the encoder and decoder neural networks, respectively
gene_dispersion – One of the following
'gene' - dispersion parameter of the NB is constant per gene across cells
'gene-batch' - dispersion can differ between different batches
'gene-label' - dispersion can differ between different labels
protein_dispersion – One of the following
'protein' - dispersion parameter is constant per protein across cells
'protein-batch' - dispersion can differ between different batches (NOT TESTED)
'protein-label' - dispersion can differ between different labels (NOT TESTED)
log_variational (bool) – Apply log(data + 1) to the input prior to encoding, for numerical stability. This is not a normalization.
reconstruction_loss_gene – One of
'nb' - Negative binomial distribution
'zinb' - Zero-inflated negative binomial distribution
latent_distribution (str) – One of
'normal' - Isotropic normal
'ln' - Logistic normal with normal params N(0, 1)
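To illustrate the 'ln' option, a logistic normal draw can be obtained by sampling a Gaussian vector and pushing it through a softmax, which places the latent vector on the probability simplex. The following is a minimal standard-library sketch of that idea, not the library's code; softmax and sample_logistic_normal are hypothetical helper names.

```python
import math
import random


def softmax(values):
    """Map a real vector to the probability simplex (numerically stable)."""
    shift = max(values)
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def sample_logistic_normal(mean, std, rng):
    """Draw one logistic normal sample: Gaussian draw followed by a softmax."""
    gaussian = [rng.gauss(m, s) for m, s in zip(mean, std)]
    return softmax(gaussian)


rng = random.Random(0)
n_latent = 20  # same value as the default n_latent in the signature
# N(0, 1) parameters, as in the description of the 'ln' option.
z = sample_logistic_normal([0.0] * n_latent, [1.0] * n_latent, rng)
print(len(z), sum(z))
```

With 'normal', the Gaussian draw would be used directly, without the softmax step.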
Examples:
>>> from scvi.dataset import Dataset10X
>>> from scvi.models import TOTALVI
>>> dataset = Dataset10X(dataset_name="pbmc_10k_protein_v3", save_path=save_path)
>>> totalvae = TOTALVI(dataset.nb_genes, len(dataset.protein_names))
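The 'nb' and 'zinb' choices for the gene reconstruction loss differ only in an extra point mass at zero. The sketch below writes both log-likelihoods with a mean mu and an inverse dispersion theta. It is a standard-library illustration under stated assumptions: the function names are hypothetical, and the zero-inflation weight pi is taken as a probability, which may differ from how the library parameterizes dropout.

```python
import math


def nb_log_prob(x, mu, theta):
    """Log-probability of count x under NB with mean mu, inverse dispersion theta."""
    log_theta_mu = math.log(theta + mu)
    return (
        math.lgamma(x + theta) - math.lgamma(theta) - math.lgamma(x + 1)
        + theta * (math.log(theta) - log_theta_mu)
        + x * (math.log(mu) - log_theta_mu)
    )


def zinb_log_prob(x, mu, theta, pi):
    """Zero-inflated NB: with probability pi the count is forced to zero."""
    nb = nb_log_prob(x, mu, theta)
    if x == 0:
        # A zero can come from the point mass or from the NB itself.
        return math.log(pi + (1.0 - pi) * math.exp(nb))
    return math.log(1.0 - pi) + nb


# Zero counts become more likely once zero inflation is switched on.
print(nb_log_prob(0, mu=5.0, theta=2.0))
print(zinb_log_prob(0, mu=5.0, theta=2.0, pi=0.3))
```

The reconstruction loss is then the negative of such a log-likelihood, summed over genes.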
Methods Summary
forward(x, y, local_l_mean_gene, …[, …]) – Returns the reconstruction loss and the Kullback-Leibler divergences
get_reconstruction_loss(x, y, px_, py_[, …]) – Compute reconstruction loss
get_sample_dispersion(x, y[, batch_index, …]) – Returns the tensors of dispersions for genes and proteins
get_sample_rate(x, y[, batch_index, label, …]) – Returns the tensor of negative binomial mean for genes
get_sample_scale(x, y[, batch_index, label, …]) – Returns tuple of gene and protein scales.
inference(x, y[, batch_index, label, …]) – Internal helper function to compute necessary inference quantities
sample_from_posterior_l(x, y[, batch_index, …]) – Provides the tensor of library size from the posterior
sample_from_posterior_z(x, y[, batch_index, …]) – Access the tensor of latent values from the posterior
Methods Documentation
-
forward(x, y, local_l_mean_gene, local_l_var_gene, batch_index=None, label=None)
Returns the reconstruction loss and the Kullback-Leibler divergences
- Parameters
x (Tensor) – tensor of values with shape (batch_size, n_input_genes)
y (Tensor) – tensor of values with shape (batch_size, n_input_proteins)
local_l_mean_gene (Tensor) – tensor of means of the prior distribution of latent variable l with shape (batch_size, 1)
local_l_var_gene (Tensor) – tensor of variances of the prior distribution of latent variable l with shape (batch_size, 1)
batch_index (Optional[Tensor]) – array that indicates which batch the cells belong to with shape batch_size
label (Optional[Tensor]) – tensor of cell-type labels with shape (batch_size, n_labels)
- Return type
Tuple[FloatTensor, FloatTensor, FloatTensor, FloatTensor]
- Returns
the reconstruction loss and the Kullback-Leibler divergences
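Since local_l_mean_gene and local_l_var_gene describe a Gaussian prior on the latent variable l, a divergence term against that prior has a closed form when the posterior is also Gaussian. The sketch below shows that formula for a single cell. It assumes a Gaussian posterior, which this page does not state, and gaussian_kl is a hypothetical name rather than a library function.

```python
import math


def gaussian_kl(mean_q, var_q, mean_p, var_p):
    """KL( N(mean_q, var_q) || N(mean_p, var_p) ) for scalar Gaussians."""
    return 0.5 * (
        math.log(var_p / var_q)
        + (var_q + (mean_q - mean_p) ** 2) / var_p
        - 1.0
    )


# Hypothetical per-cell values: posterior over l versus the prior
# given by local_l_mean_gene and local_l_var_gene for that cell.
posterior_mean, posterior_var = 7.9, 0.05
local_l_mean_gene, local_l_var_gene = 8.0, 0.25
print(gaussian_kl(posterior_mean, posterior_var, local_l_mean_gene, local_l_var_gene))
```

The divergence is zero only when posterior and prior coincide, so it penalizes library-size estimates that drift away from the prior.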
-
get_reconstruction_loss(x, y, px_, py_, pro_batch_mask_minibatch=None)
Compute reconstruction loss
-
get_sample_dispersion(x, y, batch_index=None, label=None, n_samples=1)
Returns the tensors of dispersions for genes and proteins
- Parameters
x (Tensor) – tensor of values with shape (batch_size, n_input_genes)
y (Tensor) – tensor of values with shape (batch_size, n_input_proteins)
batch_index (Optional[Tensor]) – array that indicates which batch the cells belong to with shape batch_size
label (Optional[Tensor]) – tensor of cell-type labels with shape (batch_size, n_labels)
- Returns
tensors of dispersions of the negative binomial distribution
-
get_sample_rate(x, y, batch_index=None, label=None, n_samples=1)
Returns the tensor of negative binomial mean for genes
- Parameters
x (Tensor) – tensor of values with shape (batch_size, n_input_genes)
y (Tensor) – tensor of values with shape (batch_size, n_input_proteins)
batch_index (Optional[Tensor]) – array that indicates which batch the cells belong to with shape batch_size
label (Optional[Tensor]) – tensor of cell-type labels with shape (batch_size, n_labels)
- Returns
tensor of means of the negative binomial distribution with shape (batch_size, n_input_genes)
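One common way to build a negative binomial mean in models of this family is to multiply a per-cell library size by a per-gene scale that sums to one. The sketch below assumes that construction for a single cell. This page does not confirm it for TOTALVI, and gene_rate and softmax are hypothetical helper names.

```python
import math


def softmax(values):
    """Normalize decoder outputs so the per-gene scales sum to one."""
    shift = max(values)
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def gene_rate(decoder_logits, log_library):
    """Assumed NB mean per gene: library size times normalized scale."""
    scale = softmax(decoder_logits)
    library = math.exp(log_library)
    return [library * s for s in scale]


# One cell with three genes and a log library size of 8.
rate = gene_rate([0.2, -1.0, 1.5], log_library=8.0)
print(rate)
```

Under this assumption the rates of one cell add up to its library size, which is why the scale, not the rate, is the quantity compared across cells.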
-
get_sample_scale(x, y, batch_index=None, label=None, n_samples=1, transform_batch=None, eps=0, normalize_pro=False, sample_bern=True, include_bg=False)
Returns tuple of gene and protein scales.
These scales can also be transformed into a particular batch. This function is the core of differential expression.
- Parameters
transform_batch (Optional[int]) – Int of batch to “transform” all cells into
eps – Prior count to add to protein normalized expression (Default value = 0)
normalize_pro – bool, whether to make protein expression sum to one in a cell (Default value = False)
include_bg – bool, whether to include the background component of expression (Default value = False)
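The interaction of eps, normalize_pro and include_bg can be pictured with a small standard-library sketch for one cell. Everything here is an assumption made for illustration: protein_scale is a hypothetical function, the foreground is weighted by one minus a background mixing probability, and eps is added after the optional normalization. The library may order or weight these steps differently.

```python
def protein_scale(rate_fore, rate_back, mixing, eps=0.0,
                  normalize_pro=False, include_bg=False):
    """Sketch of a protein expression value per protein for one cell."""
    values = []
    for fore, back, pi_bg in zip(rate_fore, rate_back, mixing):
        # Foreground mean, down-weighted by the background probability.
        value = (1.0 - pi_bg) * fore
        if include_bg:
            # Optionally add the background component back in.
            value += pi_bg * back
        values.append(value)
    if normalize_pro:
        # Make protein expression sum to one within the cell.
        total = sum(values)
        values = [v / total for v in values]
    # Prior count added to the (possibly normalized) expression.
    return [v + eps for v in values]


fore, back, mix = [4.0, 6.0], [1.0, 1.0], [0.5, 0.5]
print(protein_scale(fore, back, mix))
print(protein_scale(fore, back, mix, normalize_pro=True))
print(protein_scale(fore, back, mix, include_bg=True, eps=0.1))
```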
-
inference(x, y, batch_index=None, label=None, n_samples=1, transform_batch=None)
Internal helper function to compute necessary inference quantities
We use the dictionary px_ to contain the parameters of the ZINB/NB for genes. The rate refers to the mean of the NB, dropout refers to Bernoulli mixing parameters. scale refers to the quantity upon which differential expression is performed. For genes, this can be viewed as the mean of the underlying gamma distribution.
We use the dictionary py_ to contain the parameters of the Mixture NB distribution for proteins. rate_fore refers to foreground mean, while rate_back refers to background mean. scale refers to foreground mean adjusted for background probability and scaled to reside in simplex. back_alpha and back_beta are the posterior parameters for rate_back. fore_scale is the scaling factor that enforces rate_fore > rate_back.
px_["r"] and py_["r"] are the inverse dispersion parameters for genes and protein, respectively.
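The protein quantities described above can be tied together in a short sketch: a scaling factor strictly greater than one yields rate_fore > rate_back, and the two NB components are mixed by a background probability. This is a standard-library illustration for one protein in one cell. The softplus-based fore_scale, the small 1e-8 offset, and all function names are assumptions, not the library's implementation.

```python
import math


def nb_log_prob(x, mu, theta):
    """Log-probability of count x under NB with mean mu, inverse dispersion theta."""
    log_theta_mu = math.log(theta + mu)
    return (
        math.lgamma(x + theta) - math.lgamma(theta) - math.lgamma(x + 1)
        + theta * (math.log(theta) - log_theta_mu)
        + x * (math.log(mu) - log_theta_mu)
    )


def foreground_rate(rate_back, raw_scale):
    """Scale the background mean by a factor > 1 so that rate_fore > rate_back."""
    softplus = max(raw_scale, 0.0) + math.log1p(math.exp(-abs(raw_scale)))
    fore_scale = 1.0 + softplus + 1e-8  # assumed construction, always above 1
    return rate_back * fore_scale


def mixture_nb_log_prob(y, rate_back, rate_fore, theta, pi_back):
    """Two-component NB mixture: background with weight pi_back, else foreground."""
    log_back = math.log(pi_back) + nb_log_prob(y, rate_back, theta)
    log_fore = math.log(1.0 - pi_back) + nb_log_prob(y, rate_fore, theta)
    top = max(log_back, log_fore)  # log-sum-exp for numerical stability
    return top + math.log(math.exp(log_back - top) + math.exp(log_fore - top))


rate_back = 2.0
rate_fore = foreground_rate(rate_back, raw_scale=1.0)
print(rate_fore > rate_back)
print(mixture_nb_log_prob(3, rate_back, rate_fore, theta=3.0, pi_back=0.4))
```

Here theta plays the role of py_["r"], the inverse dispersion shared by both components in this sketch.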
-
sample_from_posterior_l(x, y, batch_index=None, give_mean=True)
Provides the tensor of library size from the posterior