Stereoscope#

Stereoscope [1] posits a probabilistic model of spatial transcriptomics and an associated method for the deconvolution of cell type profiles using a single-cell RNA sequencing reference dataset.

The advantages of Stereoscope are:

  • Can stratify cells into discrete cell types.

  • Scalable to very large datasets (>1 million cells).

The limitations of Stereoscope include:

  • Effectively requires a GPU for fast inference.

Preliminaries#

Stereoscope requires training two latent variable models (LVMs): one for the single-cell reference dataset and one for the spatial transcriptomics dataset, which incorporates the learned parameters of the single-cell reference LVM. The first LVM takes in as input a scRNA-seq gene expression matrix of UMI counts \(Y\) with \(N\) cells and \(G\) genes, along with a vector of cell type labels \(\vec{z}\). Subsequently, the second LVM takes in the learned parameters of the first LVM, along with a spatial gene expression matrix \(X\) with \(S\) spots and \(G\) genes.

Generative process#

Single-cell reference LVM#

For cell \(c\), the LVM assumes an observed discrete cell type label \(z_c\) and models the UMI count observation for a given gene \(g\) as a negative binomial distribution. This LVM posits that the observed UMI counts for cell \(c\) and gene \(g\) are generated by the following process:

\begin{align} y_{gc} &\sim \textrm{NegativeBinomial}(s_{c}r_{gz}, p_{g}) \tag{1} \\ \end{align}

where \(s_c = \sum_{g\in G} y_{gc}\) is the observed library size of the cell, \(r_{gz}\) is the latent rate parameter for the cell type \(z_c\) and gene \(g\), and \(p_g\) is the latent variable representing the success probability for gene \(g\).

Note

We are using the standard rate-shape parametrization of the negative binomial here, rather than the mean-dispersion parametrization used in scVI. This is to take advantage of the additive property of negative binomial distributions sharing the same shape parameter. In this case, the rate parameter for the negative binomial modeling the expression counts for a given gene and spot is equivalent to the sum of the rate parameters for each contributing cell.
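The additive property mentioned in the note can be checked numerically. The sketch below is illustrative and not part of scvi-tools. It assumes the convention in which \(\textrm{NegativeBinomial}(r, p)\) counts successes, each with probability \(p\), before \(r\) failures, so that the mean is \(rp/(1-p)\). The helper names nb_pmf and convolved_pmf are hypothetical.

```python
import math

def nb_log_pmf(k, r, p):
    """Log pmf of NegativeBinomial(r, p) at count k (assumed convention: mean r*p/(1-p))."""
    return (math.lgamma(k + r) - math.lgamma(k + 1) - math.lgamma(r)
            + r * math.log1p(-p) + k * math.log(p))

def nb_pmf(k, r, p):
    return math.exp(nb_log_pmf(k, r, p))

def convolved_pmf(n, r1, r2, p):
    """pmf at n of the sum of independent NB(r1, p) and NB(r2, p) variables."""
    return sum(nb_pmf(k, r1, p) * nb_pmf(n - k, r2, p) for k in range(n + 1))

# Toy rates for two contributing cells that share the same p.
r1, r2, p = 1.5, 2.25, 0.3

# Largest difference between the convolution and NB(r1 + r2, p) over n = 0..29.
max_gap = max(abs(convolved_pmf(n, r1, r2, p) - nb_pmf(n, r1 + r2, p))
              for n in range(30))
```

If the property holds, max_gap is zero up to floating-point error, which is why the rate for a spot can be written as a sum over the cells contributing to it.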

This generative process is also summarized in the following graphical model:

Figure: single-cell reference LVM graphical model.

The latent variables for the single-cell reference LVM, along with their descriptions, are summarized in the following table:

| Latent variable | Description | Code variable (if different) |
|---|---|---|
| \(r_{gz} \in (0, \infty)\) | Rate parameter for the negative binomial distribution. | px_scale |
| \(p_g \in [0, 1]\) | Success probability for the negative binomial distribution (called the shape parameter in the note above). | px_o \(:= \log \left( \frac{p_g}{1 - p_g} \right)\) |
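Because px_o stores the logit of \(p_g\), the probability is recovered with the logistic sigmoid. A minimal sketch follows; the function names are hypothetical, and the mean formula assumes the convention in which \(\textrm{NegativeBinomial}(r, p)\) has mean \(rp/(1-p)\).

```python
import math

def logit_to_prob(px_o):
    """Invert px_o = log(p / (1 - p)) with the logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-px_o))

def nb_mean(r, px_o):
    """Mean r * p / (1 - p) under the assumed convention; equals r * exp(px_o)."""
    p = logit_to_prob(px_o)
    return r * p / (1.0 - p)

p_g = logit_to_prob(0.0)             # a logit of 0 corresponds to p = 0.5
mean = nb_mean(4.0, math.log(3.0))   # p = 0.75, odds of 3, rate 4
```

Working with the logit keeps the optimized parameter unconstrained while \(p_g\) stays inside the unit interval.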

Spatial transcriptomics LVM#

For the second LVM, we also model the expression counts with a \(\mathrm{NegativeBinomial}\). However, for spatial data, we assume that the expression \(x_s\) of each spot \(s\) is a bulk mixture of cell types, with abundance \(v_{sz}\) for each cell type \(z\). We assume that for a given spot \(s\) and gene \(g\), the observation is generated by the following process:

\begin{align} x_{sg} &\sim \mathrm{NegativeBinomial}(\beta_g\sum_{z\in Z}v_{sz}r_{gz}, p_g) \tag{2} \\ \end{align}

where \(\beta_g\) is a gene-specific correction term for technical differences. The parameters \(r_{gz}\) and \(p_g\) are the learned parameters from the first LVM.

An additional latent variable, \(\eta_g\), is incorporated into the aggregated cell expression profile as a dummy cell type to represent gene-specific noise. The dummy cell type's expression profile is defined as \(\varepsilon_g := \mathrm{Softplus}(\eta_g)\), where \(\eta_g \sim \mathrm{Normal}(0, 1)\); this prior keeps the model from incorrectly assigning explanatory power to the noise term. Like the other cell types, the dummy cell type has an abundance parameter, \(\gamma_s\), for each spot.
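The rate in equation (2), together with the dummy noise cell type, can be sketched for a single spot as follows. This is an illustration, not the scvi-tools implementation: the function names are hypothetical, and adding the noise term outside the \(\beta_g\) factor is an assumption of this sketch that the text above does not specify.

```python
import math

def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def spot_rates(v_s, gamma_s, r, beta, eta):
    """Negative binomial rate of every gene in one spot.

    v_s:     abundances v_{sz}, one per cell type
    gamma_s: abundance of the dummy noise cell type in this spot
    r:       r[g][z], rates learned by the single-cell reference LVM
    beta:    per-gene correction terms beta_g
    eta:     per-gene noise parameters eta_g
    """
    rates = []
    for g, r_g in enumerate(r):
        cell_part = beta[g] * sum(v * r_gz for v, r_gz in zip(v_s, r_g))
        # Assumption: the noise term is added outside the beta_g factor.
        noise_part = gamma_s * softplus(eta[g])
        rates.append(cell_part + noise_part)
    return rates

r = [[0.5, 0.25], [1.0, 0.0]]   # toy values: 2 genes x 2 cell types
rates = spot_rates(v_s=[1.0, 2.0], gamma_s=1.0, r=r,
                   beta=[2.0, 1.0], eta=[0.0, 0.0])
```

Setting gamma_s to zero recovers the rate \(\beta_g\sum_{z}v_{sz}r_{gz}\) of equation (2) exactly.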

This generative process is also summarized in the following graphical model:

Figure: spatial transcriptomics LVM graphical model.

The latent variables for the spatial transcriptomics LVM, along with their descriptions, are summarized in the following table:

| Latent variable | Description | Code variable (if different) |
|---|---|---|
| \(v_{sz} \in (0, \infty)\) | Spot-specific cell type abundance. The code variable v_ind also incorporates the abundance term, \(\gamma_s\), for the dummy noise cell type, \(\varepsilon\). | v_ind |
| \(\eta_g \in (-\infty, \infty)\) | Gene-specific noise. Incorporated into the model as \(\varepsilon_g := \mathrm{Softplus}(\eta_g)\). | eta |
| \(\beta_g \in (0, \infty)\) | Correction term for technological differences. | beta |
| \(r_{gz} \in (0, \infty)\) | Rate parameter for the negative binomial distribution, shared from the single-cell reference LVM. | w |
| \(p_g \in [0,1]\) | Success probability for the negative binomial distribution (called the shape parameter in the note above), shared from the single-cell reference LVM. | px_o \(:= \log \left( \frac{p_g}{1 - p_g} \right)\) |

Inference#

Single-cell reference LVM#

Stereoscope uses maximum likelihood estimation to estimate the parameters of the first LVM with respect to the negative binomial model of UMI observations. This is achieved via stochastic gradient ascent on the likelihood function using the PyTorch framework.
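To illustrate the idea of maximum likelihood by gradient ascent, here is a deliberately simplified sketch. It is not the scvi-tools training loop: it uses full-batch ascent with a finite-difference gradient and backtracking in pure Python instead of stochastic gradients in PyTorch, fits only one rate \(r\) with \(p\) held fixed, and uses toy data. All names are hypothetical.

```python
import math

def nb_log_lik(theta, y, s, p):
    """Log-likelihood of y_i ~ NegativeBinomial(s_i * r, p) with r = exp(theta)."""
    r = math.exp(theta)
    total = 0.0
    for y_i, s_i in zip(y, s):
        n = s_i * r
        total += (math.lgamma(y_i + n) - math.lgamma(y_i + 1) - math.lgamma(n)
                  + n * math.log1p(-p) + y_i * math.log(p))
    return total

def fit_rate(y, s, p, theta=0.0, step=0.1, n_iter=200, eps=1e-5):
    """Gradient ascent on theta = log(r), which keeps r positive."""
    ll = nb_log_lik(theta, y, s, p)
    for _ in range(n_iter):
        # Central-difference approximation of the gradient.
        grad = (nb_log_lik(theta + eps, y, s, p)
                - nb_log_lik(theta - eps, y, s, p)) / (2 * eps)
        lr = step
        # Backtracking: halve the step until the likelihood does not decrease.
        while lr > 1e-12:
            candidate = theta + lr * grad
            cand_ll = nb_log_lik(candidate, y, s, p)
            if cand_ll >= ll:
                theta, ll = candidate, cand_ll
                break
            lr /= 2
    return math.exp(theta), ll

y = [3, 7, 2, 10, 5]            # toy counts for one gene in one cell type
s = [1.0, 2.0, 0.5, 3.0, 1.5]   # toy size factors standing in for s_c
initial_ll = nb_log_lik(0.0, y, s, 0.5)
r_hat, final_ll = fit_rate(y, s, p=0.5)
```

Optimizing \(\log r\) rather than \(r\) is one simple way to respect the constraint \(r_{gz} \in (0, \infty)\) without a constrained optimizer.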

Spatial transcriptomics LVM#

For the spatial transcriptomics LVM, Stereoscope uses maximum a posteriori (MAP) inference to estimate the parameters specific to the model. To be exact, the only parameter given a non-uniform prior is \(\eta_g\), which is posited as a gene-specific random effect with a standard Normal prior. Note that the \(r_{gz}\) and \(p_g\) parameters are not inferred in this step, but held fixed at the values learned by the single-cell reference LVM.
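Since \(\eta_g\) is the only parameter with a non-uniform prior, the MAP objective differs from the log-likelihood only by the log-density of the standard Normal prior on \(\eta\). A minimal sketch with hypothetical function names and toy numbers follows; how scvi-tools weights or batches these terms is not shown here.

```python
import math

def standard_normal_log_prior(eta):
    """Sum over genes of log Normal(eta_g | 0, 1)."""
    return sum(-0.5 * e * e - 0.5 * math.log(2.0 * math.pi) for e in eta)

def map_objective(log_likelihood, eta):
    """Log-posterior up to a constant: data log-likelihood plus the prior on eta."""
    return log_likelihood + standard_normal_log_prior(eta)

# Toy numbers: for the same data fit (log-likelihood of -120.0), a solution that
# relies on larger noise parameters scores lower under the MAP objective.
small_noise = map_objective(-120.0, eta=[0.0, 0.1])
large_noise = map_objective(-120.0, eta=[2.0, -3.0])
```

The prior therefore acts as a quadratic penalty on \(\eta_g\), discouraging the noise term from absorbing signal that the cell types could explain.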

Tasks#

Cell type deconvolution#

Once the model is trained, one can retrieve the estimated cell type proportions in each spot using the method:

>>> proportions = spatial_model.get_proportions()
>>> st_adata.obsm["proportions"] = proportions

These proportions are computed by normalizing across all learned cell type abundances, \(v_{sz}\), for a given spot \(s\). That is, the estimated proportion of cell type \(z\) for spot \(s\) is \(\frac{v_{sz}}{\sum_{z'} v_{sz'}}\).
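The normalization itself is a row-wise division. The following sketch uses plain Python lists and toy abundances; the helper name is hypothetical and this is not the code behind get_proportions.

```python
def normalize_abundances(v):
    """Turn per-spot abundances v[s][z] into proportions that sum to 1 per spot."""
    proportions = []
    for v_s in v:
        total = sum(v_s)
        proportions.append([v_sz / total for v_sz in v_s])
    return proportions

v = [[1.0, 3.0], [2.0, 2.0]]   # toy abundances: 2 spots x 2 cell types
props = normalize_abundances(v)
```

Each row of props sums to one, so the values are comparable across spots even when total abundance differs.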

Subsequently, for a given cell type, users can plot the cell type proportions spatially using scanpy with:

>>> import scanpy as sc
>>> sc.pl.embedding(st_adata, basis="location", color="B cells")