scvi.model.base.VAEMixin
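Methods table
| differential_abundance | Compute the differential abundance between samples. |
| get_aggregated_posterior | Compute the aggregated posterior over the u latent representations. |
| get_elbo | Compute the evidence lower bound (ELBO) on the data. |
| get_latent_representation | Compute the latent representation of the data. |
| get_marginal_ll | Compute the marginal log-likelihood of the data. |
| get_reconstruction_error | Compute the reconstruction error on the data. |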
Methods
- VAEMixin.differential_abundance(adata=None, sample_key=None, batch_size=128, num_cells_posterior=None, dof=None)
Compute the differential abundance between samples.
Computes the log probabilities of each sample conditioned on the estimated aggregate posterior distribution of each cell.
- Parameters:
adata (AnnData | MuData | None (default: None)) – The data object to compute the differential abundance for. For very large datasets, this should be a subset of the original data object.
sample_key (str | None (default: None)) – Key for the sample covariate.
batch_size (int (default: 128)) – Minibatch size for computing the differential abundance.
num_cells_posterior (int | None (default: None)) – Maximum number of cells used to compute the aggregated posterior for each sample.
dof (float | None (default: None)) – Degrees of freedom for the Student’s t-distribution components of the aggregated posterior. If None, components are Normal.
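The idea of scoring cells against per-sample aggregated posteriors can be illustrated with a small standard-library sketch. This is not the scvi-tools implementation: the one-dimensional latent space, the sample names, the per-cell posterior parameters, and all function names are assumptions made for illustration, and the components are Normal (the `dof=None` case).

```python
import math


def normal_logpdf(x, mean, std):
    """Log density of a univariate Normal distribution."""
    return -0.5 * math.log(2 * math.pi) - math.log(std) - 0.5 * ((x - mean) / std) ** 2


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def aggregated_log_prob(x, components):
    """Log density at x of an equally weighted mixture of per-cell posteriors."""
    logs = [normal_logpdf(x, mean, std) for mean, std in components]
    return logsumexp(logs) - math.log(len(components))


# Hypothetical per-cell posteriors (mean, std) in a 1-D latent space, grouped by sample.
posteriors_by_sample = {
    "sample_a": [(-1.0, 0.5), (-0.8, 0.4), (-1.2, 0.6)],
    "sample_b": [(1.0, 0.5), (1.1, 0.4), (0.9, 0.6)],
}

# Latent positions of the cells to score.
query_latents = [-1.0, 0.0, 1.0]

# One log probability per (sample, cell) pair.
log_probs = {
    sample: [aggregated_log_prob(z, comps) for z in query_latents]
    for sample, comps in posteriors_by_sample.items()
}

# Positive values: the cell's neighborhood is denser in sample_a than in sample_b.
log_ratio = [a - b for a, b in zip(log_probs["sample_a"], log_probs["sample_b"])]
```

Comparing the per-sample log probabilities of the same cell, as in `log_ratio`, is one way to read such output: a cell located where one sample's mixture is much denser than another's indicates a region that is differentially abundant between the two.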
- VAEMixin.get_aggregated_posterior(adata=None, indices=None, batch_size=None, dof=3.0)
Compute the aggregated posterior over the u latent representations.
- Parameters:
adata (default: None) – AnnData object to use. Defaults to the AnnData object used to initialize the model.
indices (default: None) – Indices of cells to use.
batch_size (default: None) – Batch size to use for computing the latent representation.
dof (default: 3.0) – Degrees of freedom for the Student’s t-distribution components. If None, components are Normal.
- Returns:
A mixture distribution of the aggregated posterior.
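To make the role of `dof` concrete, the following standard-library sketch evaluates an equally weighted mixture with either Student's t or Normal components. It is an illustration under stated assumptions, not the library's code: the one-dimensional setting, the component locations and scales, and the function names are all hypothetical.

```python
import math


def student_t_logpdf(x, loc, scale, dof):
    """Log density of a location-scale Student's t-distribution."""
    z = (x - loc) / scale
    return (
        math.lgamma((dof + 1) / 2)
        - math.lgamma(dof / 2)
        - 0.5 * math.log(dof * math.pi)
        - math.log(scale)
        - (dof + 1) / 2 * math.log1p(z * z / dof)
    )


def normal_logpdf(x, loc, scale):
    """Log density of a univariate Normal distribution."""
    z = (x - loc) / scale
    return -0.5 * math.log(2 * math.pi) - math.log(scale) - 0.5 * z * z


def mixture_logpdf(x, locs, scales, dof=3.0):
    """Log density of an equally weighted mixture; Normal components if dof is None."""
    if dof is None:
        logs = [normal_logpdf(x, m, s) for m, s in zip(locs, scales)]
    else:
        logs = [student_t_logpdf(x, m, s, dof) for m, s in zip(locs, scales)]
    top = max(logs)
    return top + math.log(sum(math.exp(v - top) for v in logs)) - math.log(len(logs))


# Hypothetical per-cell posterior locations and scales.
locs = [0.0, 0.2, -0.1]
scales = [1.0, 0.8, 1.2]

# Far from every component, the t mixture keeps much more density than the Normal one.
far_point = 8.0
t_tail = mixture_logpdf(far_point, locs, scales, dof=3.0)
normal_tail = mixture_logpdf(far_point, locs, scales, dof=None)
```

The heavier tails of the t components mean that points far from all cells still receive a non-negligible density, which makes log probabilities less extreme for outlying cells.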
- VAEMixin.get_elbo(adata=None, indices=None, batch_size=None, dataloader=None, return_mean=True, data_loader_kwargs=None, **kwargs)
Compute the evidence lower bound (ELBO) on the data.
The ELBO is the reconstruction error plus the Kullback-Leibler (KL) divergences between the variational distributions and the priors. It is different from the marginal log-likelihood; specifically, it is a lower bound on the marginal log-likelihood plus a term that is constant with respect to the variational distribution. It still gives good insights on the modeling of the data and is fast to compute.
- Parameters:
adata (AnnData | None (default: None)) – AnnData object with var_names in the same order as the ones used to train the model. If None and dataloader is also None, it defaults to the object used to initialize the model.
indices (Sequence[int] | None (default: None)) – Indices of observations in adata to use. If None, defaults to all observations. Ignored if dataloader is not None.
batch_size (int | None (default: None)) – Minibatch size for the forward pass. If None, defaults to scvi.settings.batch_size. Ignored if dataloader is not None.
dataloader (Iterator[dict[str, Tensor | None]] | None (default: None)) – An iterator over minibatches of data on which to compute the metric. The minibatches should be formatted as a dictionary of Tensor with keys as expected by the model. If None, a dataloader is created from adata.
return_mean (bool (default: True)) – Whether to return the mean of the ELBO or the ELBO for each observation.
data_loader_kwargs (dict | None (default: None)) – Keyword args for data loader, in dict form.
**kwargs – Additional keyword arguments to pass into the forward method of the module.
- Return type:
- Returns:
Evidence lower bound (ELBO) of the data.
Notes
This is not the negative ELBO, so higher is better.
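Under the "higher is better" convention, an ELBO can be written as the expected log-likelihood under the variational distribution minus the KL divergence to the prior. The sketch below checks the lower-bound property on a toy conjugate model where everything is available in closed form. The model (prior z ~ N(0, 1), likelihood x | z ~ N(z, 1)) and all names are assumptions chosen for illustration; this is not how scvi-tools computes the quantity.

```python
import math


def elbo(x, q_mean, q_std):
    """ELBO for prior z ~ N(0, 1), likelihood x | z ~ N(z, 1), q(z) = N(q_mean, q_std^2)."""
    # E_q[log p(x | z)], closed form for a Gaussian likelihood with unit variance.
    expected_log_lik = -0.5 * math.log(2 * math.pi) - 0.5 * ((x - q_mean) ** 2 + q_std**2)
    # KL(q(z) || p(z)) between two univariate Gaussians.
    kl = 0.5 * (q_std**2 + q_mean**2 - 1.0 - 2.0 * math.log(q_std))
    return expected_log_lik - kl


def log_marginal(x):
    """Exact log p(x): marginally, x ~ N(0, 2) in this toy model."""
    return -0.5 * math.log(2 * math.pi * 2.0) - x**2 / 4.0


x = 1.5
exact = log_marginal(x)

# A poor variational distribution gives a loose bound.
loose = elbo(x, q_mean=0.0, q_std=1.0)

# The exact posterior, N(x / 2, 1 / 2), makes the bound tight.
tight = elbo(x, q_mean=x / 2, q_std=math.sqrt(0.5))
```

The gap between `exact` and the ELBO equals the KL divergence from the variational distribution to the true posterior, which is why the bound is tight only when the two coincide.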
- VAEMixin.get_latent_representation(adata=None, indices=None, give_mean=True, mc_samples=5000, batch_size=None, return_dist=False, dataloader=None, **data_loader_kwargs)
Compute the latent representation of the data.
This is typically denoted as \(z_n\).
- Parameters:
adata (AnnData | None (default: None)) – AnnData object with var_names in the same order as the ones used to train the model. If None and dataloader is also None, it defaults to the object used to initialize the model.
indices (Sequence[int] | None (default: None)) – Indices of observations in adata to use. If None, defaults to all observations. Ignored if dataloader is not None.
give_mean (bool (default: True)) – If True, returns the mean of the latent distribution. If False, returns an estimate of the mean using mc_samples Monte Carlo samples.
mc_samples (int (default: 5000)) – Number of Monte Carlo samples to use for the estimator for distributions with no closed-form mean (e.g., the logistic normal distribution). Not used if give_mean is True or if return_dist is True.
batch_size (int | None (default: None)) – Minibatch size for the forward pass. If None, defaults to scvi.settings.batch_size. Ignored if dataloader is not None.
return_dist (bool (default: False)) – If True, returns the mean and variance of the latent distribution. Otherwise, returns the mean of the latent distribution.
dataloader (Iterator[dict[str, Tensor | None]] (default: None)) – An iterator over minibatches of data on which to compute the metric. The minibatches should be formatted as a dictionary of Tensor with keys as expected by the model. If None, a dataloader is created from adata.
**data_loader_kwargs – Keyword args for data loader.
- Return type:
ndarray | tuple[ndarray, ndarray]
- Returns:
An array of shape (n_obs, n_latent) if return_dist is False. Otherwise, returns a tuple of arrays of shape (n_obs, n_latent) with the mean and variance of the latent distribution.
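The reason a Monte Carlo estimator is needed for distributions such as the logistic normal is that the mean of a softmax-transformed Gaussian has no closed form. A minimal standard-library sketch of such an estimator follows; the function names, the three-dimensional latent space, and the parameter values are assumptions for illustration and do not reflect the library's internals.

```python
import math
import random


def softmax(values):
    """Map real-valued scores to a probability vector."""
    top = max(values)
    exps = [math.exp(v - top) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def logistic_normal_mean(means, stds, mc_samples=5000, seed=0):
    """Monte Carlo estimate of E[softmax(y)] for y ~ Normal(means, diag(stds^2))."""
    rng = random.Random(seed)
    totals = [0.0] * len(means)
    for _ in range(mc_samples):
        draw = [rng.gauss(m, s) for m, s in zip(means, stds)]
        for i, p in enumerate(softmax(draw)):
            totals[i] += p
    return [t / mc_samples for t in totals]


# Hypothetical variational parameters for one observation.
means = [0.5, -0.2, 1.0]
stds = [0.3, 0.8, 0.5]

estimate = logistic_normal_mean(means, stds)

# Softmax of the Gaussian mean: cheap, but in general not equal to the mean of the softmax.
plug_in = softmax(means)
```

Increasing `mc_samples` reduces the variance of the estimate at a proportional cost in compute, which is the trade-off the parameter of the same name controls.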
- VAEMixin.get_marginal_ll(adata=None, indices=None, n_mc_samples=1000, batch_size=None, return_mean=True, dataloader=None, data_loader_kwargs=None, **kwargs)
Compute the marginal log-likelihood of the data.
The computation here is a biased estimator of the marginal log-likelihood of the data.
- Parameters:
adata (AnnData | None (default: None)) – AnnData object with var_names in the same order as the ones used to train the model. If None and dataloader is also None, it defaults to the object used to initialize the model.
indices (Sequence[int] | None (default: None)) – Indices of observations in adata to use. If None, defaults to all observations. Ignored if dataloader is not None.
n_mc_samples (int (default: 1000)) – Number of Monte Carlo samples to use for the estimator. Passed into the module’s marginal_ll method.
batch_size (int | None (default: None)) – Minibatch size for the forward pass. If None, defaults to scvi.settings.batch_size. Ignored if dataloader is not None.
return_mean (bool (default: True)) – Whether to return the mean of the marginal log-likelihood or the marginal log-likelihood for each observation.
dataloader (Iterator[dict[str, Tensor | None]] (default: None)) – An iterator over minibatches of data on which to compute the metric. The minibatches should be formatted as a dictionary of Tensor with keys as expected by the model. If None, a dataloader is created from adata.
data_loader_kwargs (dict | None (default: None)) – Keyword args for data loader, in dict form.
**kwargs – Additional keyword arguments to pass into the module’s marginal_ll method.
- Return type:
- Returns:
If return_mean is True, returns the mean marginal log-likelihood. Otherwise returns a tensor of shape (n_obs,) with the marginal log-likelihood for each observation.
Notes
This is not the negative log-likelihood, so higher is better.
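A common way to estimate a marginal log-likelihood with Monte Carlo samples is importance sampling with the variational distribution as the proposal; taking the log of the averaged weights is what makes such an estimator biased for finite sample sizes. The sketch below applies this to a toy conjugate model with a known answer. It is an assumption-laden illustration (toy model, hypothetical function names), not the module's `marginal_ll` implementation.

```python
import math
import random


def normal_logpdf(x, mean, std):
    """Log density of a univariate Normal distribution."""
    return -0.5 * math.log(2 * math.pi) - math.log(std) - 0.5 * ((x - mean) / std) ** 2


def marginal_ll_estimate(x, q_mean, q_std, n_mc_samples=1000, seed=0):
    """Importance-sampling estimate of log p(x) for z ~ N(0, 1), x | z ~ N(z, 1)."""
    rng = random.Random(seed)
    log_weights = []
    for _ in range(n_mc_samples):
        z = rng.gauss(q_mean, q_std)
        # log p(x | z) + log p(z) - log q(z)
        log_w = (
            normal_logpdf(x, z, 1.0)
            + normal_logpdf(z, 0.0, 1.0)
            - normal_logpdf(z, q_mean, q_std)
        )
        log_weights.append(log_w)
    # log of the mean weight, computed with the log-sum-exp trick for stability.
    top = max(log_weights)
    return top + math.log(sum(math.exp(w - top) for w in log_weights)) - math.log(n_mc_samples)


x = 1.5

# Exact value for this toy model: marginally, x ~ N(0, 2).
exact = normal_logpdf(x, 0.0, math.sqrt(2.0))

# An imperfect proposal gives an approximate answer that improves with more samples.
estimate = marginal_ll_estimate(x, q_mean=0.6, q_std=0.9)

# With the exact posterior N(x / 2, 1 / 2) as proposal, every weight equals p(x).
exact_proposal = marginal_ll_estimate(x, q_mean=x / 2, q_std=math.sqrt(0.5), n_mc_samples=10)
```

The closer the proposal is to the true posterior, the lower the variance of the weights and the fewer samples are needed for a given accuracy.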
- VAEMixin.get_reconstruction_error(adata=None, indices=None, batch_size=None, dataloader=None, return_mean=True, data_loader_kwargs=None, **kwargs)
Compute the reconstruction error on the data.
The reconstruction error is the negative log likelihood of the data given the latent variables. It is different from the marginal log-likelihood, but still gives good insights on the modeling of the data and is fast to compute. This is typically written as \(p(x \mid z)\), the likelihood term given one posterior sample.
- Parameters:
adata (AnnData | None (default: None)) – AnnData object with var_names in the same order as the ones used to train the model. If None and dataloader is also None, it defaults to the object used to initialize the model.
indices (Sequence[int] | None (default: None)) – Indices of observations in adata to use. If None, defaults to all observations. Ignored if dataloader is not None.
batch_size (int | None (default: None)) – Minibatch size for the forward pass. If None, defaults to scvi.settings.batch_size. Ignored if dataloader is not None.
dataloader (Iterator[dict[str, Tensor | None]] | None (default: None)) – An iterator over minibatches of data on which to compute the metric. The minibatches should be formatted as a dictionary of Tensor with keys as expected by the model. If None, a dataloader is created from adata.
return_mean (bool (default: True)) – Whether to return the mean reconstruction loss or the reconstruction loss for each observation.
data_loader_kwargs (dict | None (default: None)) – Keyword args for data loader, in dict form.
**kwargs – Additional keyword arguments to pass into the forward method of the module.
- Return type:
- Returns:
Reconstruction error for the data.
Notes
This is not the negative reconstruction error, so higher is better.
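As a rough illustration of the likelihood term, the sketch below evaluates log p(x | z) per observation for count data and then averages it, mirroring the per-observation versus mean choice that `return_mean` describes. The Poisson likelihood, the count values, and the decoded rates are assumptions made for the example; the actual likelihood depends on the model, and this is not the library's code.

```python
import math


def poisson_log_lik(counts, rates):
    """Sum over features of log Poisson(count | rate) for one observation."""
    return sum(c * math.log(r) - r - math.lgamma(c + 1) for c, r in zip(counts, rates))


# Hypothetical counts for three observations and three features.
observed = [[2, 0, 5], [1, 3, 0], [0, 0, 1]]

# Hypothetical rates decoded from one latent sample per observation.
decoded_rates = [[2.1, 0.3, 4.5], [0.9, 2.7, 0.2], [0.1, 0.2, 1.3]]

# One value per observation.
per_observation = [poisson_log_lik(c, r) for c, r in zip(observed, decoded_rates)]

# A single summary value, averaged over observations.
mean_value = sum(per_observation) / len(per_observation)
```

Values closer to zero indicate that the decoded rates explain the observed counts better, consistent with the "higher is better" reading in the Notes.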