# DestVI

**DestVI** [^ref1] (Deconvolution of Spatial Transcriptomics profiles using Variational Inference)
posits a conditional generative model of spatial transcriptomics down to the sub-cell-type variation level which
can be used to explore the spatial organization of a tissue and understanding gene expression variation between tissues and conditions.

The advantages of DestVI are:

-   Can stratify cells into discrete cell types and model continuous sub-cell-type variation.
-   Scalable to very large datasets (>1 million cells).

The limitations of DestVI include:

-   Effectively requires a GPU for fast inference.

```{topic} Tutorial:

-   {doc}`/tutorials/notebooks/spatial/DestVI_tutorial`
```

## Preliminaries

DestVI requires training two models, the scLVM (single-cell latent variable model) and the
stLVM (spatial transcriptomic latent variable model). The scLVM takes in as input a scRNA-seq gene
expression matrix of UMI counts $X$ with $N$ cells and $G$ genes, along with
a vector of cell type labels $\vec{c}$. Subsequently, the stLVM takes in the trained scLVM,
along a spatial gene expression matrix $Y$ with $S$ spots and $G$ genes.
Optionally, the user can specify the number of components used for the mixture model underlying the
empirical prior.

## Generative process

### scLVM

For cell $n$, the scLVM assumes observed discrete cell type labels $c_n$ and models
continuous covariates $\gamma_n$ of dimension $d$ to explain variation in gene expression within a cell type.
The scLVM posits that the observed UMI counts for cell $n$ are generated by the following process:

```{math}
:nowrap: true

\begin{align}
    \gamma_n &\sim \textrm{Normal}(0, I) \tag{1} \\
    x_{ng} &\sim \textrm{NegativeBinomial}(l_nf^g(c_n, \gamma_n), p_g) \tag{2} \\
\end{align}
```

where $l_n$ is the library size, $f$ is a two-layer neural network which outputs a $G$
dimensional vector, and $p_g$ is the rate parameter of the negative binomial distribution for
a given gene $g$.

:::{note}
We are using the standard rate-shape parametrization of the negative binomial here, rather than the mean-dispersion
parametrization used in {doc}`/user_guide/models/scvi`. This is to take advantage of the additive property of
negative binomial distributions sharing the same shape parameter. In this case, the rate parameter for the
negative binomial modeling the expression counts for a given gene and spot is equivalent to the sum of the rate
parameters for each contributing cell.
:::

This generative process is also summarized in the following graphical model:

:::{figure} figures/scLVM_graphical_model.svg
:align: center
:alt: scLVM graphical model
:class: img-fluid

scLVM graphical model.
:::

The latent variables for the scLVM, along with their description are summarized in the following table:

```{eval-rst}
.. list-table::
   :widths: 20 90 15
   :header-rows: 1

   * - Latent variable
     - Description
     - Code variable (if different)
   * - :math:`\gamma_n \in \mathbb{R}^d`
     - Low-dimensional representation of sub-cell-type covariates.
     - ``z``
   * - :math:`p_g \in (0, \infty)`
     - Rate parameter for the negative binomial distribution.
     - ``px_r``
```

### stLVM

For the stLVM, we also model the expression counts with a $\mathrm{NegativeBinomial}$. However,
for spatial data, we assume that each spot $s$ has expression $x_s$ composed of a bulk of cell types, with
cell type abundance, $\beta_{sc}$, for each cell type $c$. We assume that for a given spot $s$
and gene $g$, the observation is generated as a function of the latent variables $(c, \gamma_s^c)$ by the following process:

```{math}
:nowrap: true

\begin{align}
    \gamma_x^c &\sim \sum_{k=1}^K m_{kc} q_\Phi(\gamma^c \mid u_{kc}, c) \tag{4} \\
    x_{sg} &\sim \mathrm{NegativeBinomial}(l_s\alpha_g\sum_{c=1}^{C}\beta_{sc}f^g(c, \gamma_s^c), p_g) \tag{5} \\
\end{align}
```

where $l_s$ is the library size and $\alpha_g$ is a correction term for
difference in experimental assays. Like the scLVM, $f$ is a decoder neural network, and
$p_g$ is the rate parameter for the negative binomial distribution.

To avoid the latent variable $\gamma_s^c$ from incorporating variation attributed to experimental
assay differences, we assign an empirical prior informed by the scLVM and the corresponding
cells of the same cell type in the scRNA-seq dataset. To compute this function, we subcluster the latent space of the
scLVM for each cell type to K cell type specific clusters. For each cluster we compute an empirical mean and variance.
Above, $\{u_{kc}\}_{k=1}^K$ designates the set of cell type specific subclusters from cell type $c$ in the scRNA-seq dataset, and
$q_\Phi$ designates the empirical normal distribution from the computed cluster mean and variance.
The loss is weighted by the probability of a random cell from this cell type to be in the respective cluster in the
scRNA-seq dataset (mixture probability, $m_{kc}$).
In literature, the prior is referred to as a VampPrior ("variational aggregated mixture of posteriors" prior) [^ref2].
More can be read on this prior in the DestVI paper.

Lastly, an additional latent variable, $\eta_g$, is incorporated into the aggregated cell expression profile
as a dummy cell type to represent gene specific noise. The dummy cell type's expression profile is distributed
as $\epsilon_g := \mathrm{Softplus}(\eta_g)$ where $\eta_g \sim \mathrm{Normal}(0, 1)$.
Like the other cell types, there is an associated cell type abundance parameter $\beta_{sc}$ associated with $\eta$.
We suspect each spot to only contain a fraction of the different cell types. To increase sparsity of the cell type
proportions, the stLVM supports L1 regularization on the cell types proportions $\beta_{sc}$. By default this loss is
not used.

This generative process is also summarized in the following graphical model:

:::{figure} figures/stLVM_graphical_model.svg
:align: center
:alt: stLVM graphical model
:class: img-fluid

stLVM graphical model.
:::

The latent variables for the stLVM, along with their description are summarized in the following table:

```{eval-rst}
.. list-table::
   :widths: 20 90 15
   :header-rows: 1

   * - Latent variable
     - Description
     - Code variable (if different)
   * - :math:`\beta_{sc} \in (0, \infty)`
     - Spot-specific cell type abundance.
     - ``v_ind``
   * - :math:`\gamma_s^c \in (-\infty, \infty)`
     - Low-dimensional representation of sub-cell-type covariates for a given spot and cell type.
     - ``gamma``
   * - :math:`\eta_g \in (0, \infty)`
     - Gene-specific noise.
     - ``eta``
   * - :math:`\alpha_g \in (0, \infty)`
     - Correction term for technological differences.
     - ``beta``
   * - :math:`p_g \in (0,\infty)`
     - Rate parameter for the negative binomial distribution.
     - ``px_o``

```

## Inference

### scLVM

DestVI uses variational inference and specifically auto-encoding variational bayes (see {doc}`/user_guide/background/variational_inference`)
to learn both the model parameters (the neural network params, rate params, etc.) and an approximate posterior distribution
for the scLVM. Like {class}`scvi.model.SCVI`, the underlying class used as the encoder for DestVI is {class}`~scvi.nn.Encoder`.

### stLVM

For the stLVM, DestVI infers point estimates for latent variables $\gamma^c, \alpha, \beta$ using a penalized
likelihood method. Beyond vanilla MAP inference, to regularize $\alpha$ a variance penalty is applied across all genes.
Additionally, rather than having just $C$ parameters per spot to denote the estimated cell type abundances per spot, the stLVM
has $dC$ parameters per spot as well to account for the latent space learned by the scLVM.

The loss is defined as:

```{math}
:nowrap: true

\begin{align}
     L(l, \alpha, \beta, f^g, \gamma, p, \eta) := &-\log p(X \mid l, \alpha, \beta, f^g, \gamma, p, \eta) - \lambda_{\eta} \log p(\eta) \\
     &+ \lambda_{\alpha} \mathrm{Var}(\alpha) - \log p(\gamma \mid \mathrm{VampPrior}) + \lambda_{\beta} \lVert \beta_{sc} \rVert_1  \tag{6} \\
\end{align}
```

where $\mathrm{Var}(\alpha)$ refers to the empirical variance of the parameters alpha across all genes. We used this as a practical form of regularization (a similar regularizer is used in the ZINB-WaVE model [^ref3]).

$\lambda_{\beta}$ (`l1_reg` in code), $\lambda_{\eta}$ (`eta_reg` in code) and $\lambda_{\alpha}$ (`beta_reg` in code) are hyperparameters used to scale the loss term. Increasing $\lambda_{\beta}$ leads to increased sparsity of cell type proportions. Increasing $\lambda_{\alpha}$ leads to less model flexibility for technical variation between single cell and spatial sequencing dataset. Increasing $\lambda_{\eta}$ leads to more genes being explained by the dummy cell type (we recommend to not change the default value).
To avoid overfitting, DestVI amortizes inference using a neural network to parametrize the latent variables.
Via the `amortization` parameter of {class}`scvi.module.MRDeconv`, the user can specify which of
$\beta$ and $\gamma^c$ will be parametrized by the neural network.

## Tasks

### Cell type deconvolution

Once the model is trained, one can retrieve the estimated cell type proportions in each spot using the method:

```
>>> proportions = st_model.get_proportions()
>>> st_adata.obsm["proportions"] = proportions
```

These proportions are computed by normalizing across all learned cell type abundances, $\beta_{sc}$, for a given spot $s$.
I.e. the estimated proportion of cell type $c$ for spot $s$ is $\frac{\beta_{sc}}{\sum_c \beta_{sc}}$.

Subsequently for a given cell type, users can plot a heatmap of the cell type proportions spatially using scanpy with:

```
>>> import scanpy as sc
>>> st_adata.obs['B cells'] = st_adata.obsm['proportions']['B cells']
>>> sc.pl.spatial(st_adata, color="B cells", spot_size=130)
```

### Intra cell type variation

Users can retrieve the values of $\gamma$, the latent variables corresponding to the
modeled cell-type-specific continuous covariates with:

```
>>> gamma = st_model.get_gamma()["B cells"]
>>> st_adata.obsm["B_cells_gamma"] = gamma
```

### Cell-type-specific gene expression imputation

Assuming the user has identified key gene modules that vary within a cell type of interest, they can
impute the spatial pattern of the cell-type-specific gene expression with:

```
>>> # Filter spots with low abundance.
>>> indices = np.where(st_adata.obsm["proportions"][ct_name].values > 0.03)[0]
>>> imputed_counts = st_model.get_scale_for_ct("Monocyte", indices=indices)[["Cxcl9", "Cxcl10", "Fcgr1"]]
```

### Comparative analysis between samples

To perform differential expression across samples, one can apply a frequentist test by taking samples
from the parameters of the generative distribution predicted for each spot in question.

### Utilities function

To explore the results of the output of the stLVM, we published a utilities function covering functions
for automatic thresholding of cell type proportions, a spatial PCA analysis to find main axis of variation
in spatial gene expression and the described frequentist test for differential expression. Further information
can be found on [destvi_utils](https://destvi-utils.readthedocs.io/en/latest/installation.html)

[^ref1]: Romain Lopez, Baoguo Li, Hadas Keren-Shaul, Pierre Boyeau, Merav Kedmi, David Pilzer, Adam Jelinski, Ido Yofe, Eyal David, Allon Wagner, Can Ergen, Yoseph Addadi, Ofra Golani, Franca Ronchese, Michael I Jordan, Ido Amit, Nir Yosef (2022). _DestVI identifies continuums of cell types in spatial transcriptomics data._ [Nature Biotechnology (in press)](https://www.biorxiv.org/content/10.1101/2021.05.10.443517v1)
[^ref2]: Jakub Tomczak, Max Welling (2018),_VAE with a VampPrior_, [Proceedings of Machine Learning Research](https://proceedings.mlr.press/v84/tomczak18a.html)
[^ref3]: Davide Risso, Fanny Perraudeau, Svetlana Gribkova, Sandrine Dudoit, Jean-Philippe Vert (2018). _A general and flexible method for signal extraction from single-cell RNA-seq data_, [Nature Communications] (https://www.nature.com/articles/s41467-017-02554-5)