scvi.model.base.VAEMixin
Methods table
| get_elbo | Compute the evidence lower bound (ELBO) on the data. |
| get_latent_representation | Compute the latent representation of the data. |
| get_marginal_ll | Compute the marginal log-likelihood of the data. |
| get_reconstruction_error | Compute the reconstruction error on the data. |
Methods
- VAEMixin.get_elbo(adata=None, indices=None, batch_size=None, dataloader=None, return_mean=True, **kwargs)
Compute the evidence lower bound (ELBO) on the data.
The ELBO is the reconstruction error plus the Kullback-Leibler (KL) divergences between the variational distributions and the priors. It is different from the marginal log-likelihood; specifically, it is a lower bound on the marginal log-likelihood plus a term that is constant with respect to the variational distribution. It still gives good insight into how well the model fits the data and is fast to compute.
- Parameters:
adata (AnnData | None (default: None)) – AnnData object with var_names in the same order as the ones used to train the model. If None and dataloader is also None, it defaults to the object used to initialize the model.
indices (Sequence[int] | None (default: None)) – Indices of observations in adata to use. If None, defaults to all observations. Ignored if dataloader is not None.
batch_size (int | None (default: None)) – Minibatch size for the forward pass. If None, defaults to scvi.settings.batch_size. Ignored if dataloader is not None.
dataloader (Iterator[dict[str, Tensor | None]] (default: None)) – An iterator over minibatches of data on which to compute the metric. The minibatches should be formatted as a dictionary of Tensor with keys as expected by the model. If None, a dataloader is created from adata.
return_mean (bool (default: True)) – Whether to return the mean of the ELBO or the ELBO for each observation.
**kwargs – Additional keyword arguments to pass into the forward method of the module.
- Return type:
- Returns:
Evidence lower bound (ELBO) of the data.
Notes
This is not the negative ELBO, so higher is better.
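The relationship between the ELBO, the reconstruction term, the KL term, and the marginal log-likelihood can be checked by hand on a toy model. The sketch below is not how scvi-tools computes the ELBO; it assumes a hypothetical one-dimensional Gaussian model (prior z ~ N(0, 1), likelihood x | z ~ N(z, 1), variational distribution N(q_mean, q_var)), for which every term has a closed form. The function names are illustrative only.

```python
import math


def gaussian_elbo(x: float, q_mean: float, q_var: float) -> float:
    """ELBO for the toy model z ~ N(0, 1), x | z ~ N(z, 1), q(z | x) = N(q_mean, q_var)."""
    # Reconstruction loss: expected negative log-likelihood E_q[-log p(x | z)].
    recon_loss = 0.5 * math.log(2 * math.pi) + 0.5 * ((x - q_mean) ** 2 + q_var)
    # KL divergence between q(z | x) and the standard normal prior.
    kl = 0.5 * (q_var + q_mean**2 - 1.0 - math.log(q_var))
    # Return the ELBO itself (higher is better), not the loss.
    return -(recon_loss + kl)


def log_marginal(x: float) -> float:
    """Exact log p(x) for the toy model, where marginally x ~ N(0, 2)."""
    return -0.5 * math.log(2 * math.pi * 2.0) - x**2 / 4.0


x = 1.0
exact = log_marginal(x)
# q equal to the true posterior N(x / 2, 1 / 2): the bound is tight.
elbo_exact_q = gaussian_elbo(x, q_mean=x / 2, q_var=0.5)
# q equal to the prior N(0, 1): the bound is strictly looser.
elbo_poor_q = gaussian_elbo(x, q_mean=0.0, q_var=1.0)
print(exact, elbo_exact_q, elbo_poor_q)
```

In this toy setting the ELBO matches the marginal log-likelihood only when the variational distribution equals the true posterior, and falls below it otherwise.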
- VAEMixin.get_latent_representation(adata=None, indices=None, give_mean=True, mc_samples=5000, batch_size=None, return_dist=False, dataloader=None)
Compute the latent representation of the data.
This is typically denoted as \(z_n\).
- Parameters:
adata (AnnData | None (default: None)) – AnnData object with var_names in the same order as the ones used to train the model. If None and dataloader is also None, it defaults to the object used to initialize the model.
indices (Sequence[int] | None (default: None)) – Indices of observations in adata to use. If None, defaults to all observations. Ignored if dataloader is not None.
give_mean (bool (default: True)) – If True, returns the mean of the latent distribution. If False, returns an estimate of the mean using mc_samples Monte Carlo samples.
mc_samples (int (default: 5000)) – Number of Monte Carlo samples to use for the estimator for distributions with no closed-form mean (e.g., the logistic normal distribution). Not used if give_mean is True or if return_dist is True.
batch_size (int | None (default: None)) – Minibatch size for the forward pass. If None, defaults to scvi.settings.batch_size. Ignored if dataloader is not None.
return_dist (bool (default: False)) – If True, returns the mean and variance of the latent distribution. Otherwise, returns the mean of the latent distribution.
dataloader (Iterator[dict[str, Tensor | None]] (default: None)) – An iterator over minibatches of data on which to compute the metric. The minibatches should be formatted as a dictionary of Tensor with keys as expected by the model. If None, a dataloader is created from adata.
- Return type:
ndarray[Any, dtype[TypeVar(_ScalarType_co, bound=generic, covariant=True)]] | tuple[ndarray[Any, dtype[TypeVar(_ScalarType_co, bound=generic, covariant=True)]], ndarray[Any, dtype[TypeVar(_ScalarType_co, bound=generic, covariant=True)]]]
- Returns:
An array of shape (n_obs, n_latent) if return_dist is False. Otherwise, returns a tuple of arrays of shape (n_obs, n_latent) with the mean and variance of the latent distribution.
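The mc_samples parameter refers to a Monte Carlo estimate of the mean for latent distributions without a closed-form mean, such as the logistic normal. The following sketch illustrates that idea only; it is not the scvi-tools implementation. It assumes a hypothetical helper, logistic_normal_mean, that draws Gaussian samples, maps each through a softmax, and averages the results.

```python
import math
import random


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    shift = max(values)
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def logistic_normal_mean(q_mean, q_std, mc_samples=5000, seed=0):
    """Monte Carlo estimate of E[softmax(z)] with z ~ N(q_mean, diag(q_std ** 2))."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    running = [0.0] * len(q_mean)
    for _ in range(mc_samples):
        # Draw one Gaussian sample per latent dimension.
        z = [rng.gauss(m, s) for m, s in zip(q_mean, q_std)]
        # Map the sample onto the simplex and accumulate.
        for k, p in enumerate(softmax(z)):
            running[k] += p
    return [r / mc_samples for r in running]


estimate = logistic_normal_mean([0.0, 1.0, -1.0], [0.5, 0.5, 0.5])
print(estimate)
```

Each averaged sample lies on the simplex, so the estimate sums to one; increasing mc_samples reduces its variance at the cost of more computation.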
- VAEMixin.get_marginal_ll(adata=None, indices=None, n_mc_samples=1000, batch_size=None, return_mean=True, dataloader=None, **kwargs)
Compute the marginal log-likelihood of the data.
The computation here is a biased estimator of the marginal log-likelihood of the data.
- Parameters:
adata (AnnData | None (default: None)) – AnnData object with var_names in the same order as the ones used to train the model. If None and dataloader is also None, it defaults to the object used to initialize the model.
indices (Sequence[int] | None (default: None)) – Indices of observations in adata to use. If None, defaults to all observations. Ignored if dataloader is not None.
n_mc_samples (int (default: 1000)) – Number of Monte Carlo samples to use for the estimator. Passed into the module’s marginal_ll method.
batch_size (int | None (default: None)) – Minibatch size for the forward pass. If None, defaults to scvi.settings.batch_size. Ignored if dataloader is not None.
return_mean (bool (default: True)) – Whether to return the mean of the marginal log-likelihood or the marginal log-likelihood for each observation.
dataloader (Iterator[dict[str, Tensor | None]] (default: None)) – An iterator over minibatches of data on which to compute the metric. The minibatches should be formatted as a dictionary of Tensor with keys as expected by the model. If None, a dataloader is created from adata.
**kwargs – Additional keyword arguments to pass into the module’s marginal_ll method.
- Return type:
float | Tensor
- Returns:
If return_mean is True, returns the mean marginal log-likelihood. Otherwise, returns a tensor of shape (n_obs,) with the marginal log-likelihood for each observation.
Notes
This is not the negative log-likelihood, so higher is better.
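One common way to build a Monte Carlo estimator of a marginal log-likelihood is importance sampling with the variational distribution as the proposal. The sketch below assumes that approach on a hypothetical one-dimensional Gaussian model (z ~ N(0, 1), x | z ~ N(z, 1)); whether a given module's marginal_ll method uses the same estimator is not stated here, and the function names are illustrative. Because the logarithm is applied to an average of weights, such an estimator is biased for any finite number of samples.

```python
import math
import random


def log_normal_pdf(value, mean, var):
    """Log density of a univariate normal distribution."""
    return -0.5 * math.log(2 * math.pi * var) - (value - mean) ** 2 / (2 * var)


def estimate_marginal_ll(x, q_mean, q_var, n_mc_samples=1000, seed=0):
    """Importance-sampling estimate of log p(x) with proposal q = N(q_mean, q_var)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    log_weights = []
    for _ in range(n_mc_samples):
        z = rng.gauss(q_mean, math.sqrt(q_var))
        # log p(x, z) = log p(z) + log p(x | z) for the toy model.
        log_joint = log_normal_pdf(z, 0.0, 1.0) + log_normal_pdf(x, z, 1.0)
        log_weights.append(log_joint - log_normal_pdf(z, q_mean, q_var))
    # log-mean-exp of the importance weights, computed stably.
    top = max(log_weights)
    log_sum = top + math.log(sum(math.exp(w - top) for w in log_weights))
    return log_sum - math.log(n_mc_samples)


x = 1.0
exact = log_normal_pdf(x, 0.0, 2.0)  # marginally x ~ N(0, 2)
from_posterior = estimate_marginal_ll(x, q_mean=x / 2, q_var=0.5)
from_prior = estimate_marginal_ll(x, q_mean=0.0, q_var=1.0, n_mc_samples=5000)
print(exact, from_posterior, from_prior)
```

When the proposal equals the true posterior, every importance weight equals p(x) and the estimate is exact; with a poorer proposal, more samples are needed to approach the exact value.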
- VAEMixin.get_reconstruction_error(adata=None, indices=None, batch_size=None, dataloader=None, return_mean=True, **kwargs)
Compute the reconstruction error on the data.
The reconstruction error is the negative log-likelihood of the data given the latent variables. It is different from the marginal log-likelihood, but still gives good insight into how well the model fits the data and is fast to compute. The underlying likelihood term is typically written as \(p(x \mid z)\) and is evaluated at one posterior sample.
- Parameters:
adata (AnnData | None (default: None)) – AnnData object with var_names in the same order as the ones used to train the model. If None and dataloader is also None, it defaults to the object used to initialize the model.
indices (Sequence[int] | None (default: None)) – Indices of observations in adata to use. If None, defaults to all observations. Ignored if dataloader is not None.
batch_size (int | None (default: None)) – Minibatch size for the forward pass. If None, defaults to scvi.settings.batch_size. Ignored if dataloader is not None.
dataloader (Iterator[dict[str, Tensor | None]] (default: None)) – An iterator over minibatches of data on which to compute the metric. The minibatches should be formatted as a dictionary of Tensor with keys as expected by the model. If None, a dataloader is created from adata.
return_mean (bool (default: True)) – Whether to return the mean reconstruction loss or the reconstruction loss for each observation.
**kwargs – Additional keyword arguments to pass into the forward method of the module.
- Return type:
- Returns:
Reconstruction error for the data.
Notes
This is not the negative reconstruction error, so higher is better.
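The difference between a per-observation value and its mean, as controlled by return_mean, can be illustrated with a negative log-likelihood computed by hand. The sketch below is an assumption-laden toy, not the scvi-tools code path: it uses a Poisson likelihood for brevity (a model's actual likelihood may differ), and poisson_nll and reconstruction_error are hypothetical names. It computes only the negative log-likelihood named in the description; the sign of the value the method itself returns is governed by its Notes.

```python
import math


def poisson_nll(counts, rates):
    """Negative log-likelihood of one observation under independent Poisson features."""
    return -sum(
        x * math.log(r) - r - math.lgamma(x + 1) for x, r in zip(counts, rates)
    )


def reconstruction_error(observations, decoded_rates, return_mean=True):
    """Per-observation negative log-likelihood, or its mean over observations."""
    per_obs = [poisson_nll(x, r) for x, r in zip(observations, decoded_rates)]
    if return_mean:
        return sum(per_obs) / len(per_obs)
    return per_obs


# Two observations with two features each; rates stand in for decoder outputs.
observations = [[0, 2], [1, 0]]
decoded_rates = [[1.0, 2.0], [1.0, 1.0]]

per_obs = reconstruction_error(observations, decoded_rates, return_mean=False)
mean_error = reconstruction_error(observations, decoded_rates, return_mean=True)
print(per_obs, mean_error)
```

The per-observation form is useful for spotting individual observations that the model reconstructs poorly, while the mean gives a single summary number for comparing models.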