scvi.model.TOTALVI.get_normalized_expression

TOTALVI.get_normalized_expression(adata=None, indices=None, transform_batch=None, gene_list=None, protein_list=None, library_size=1, n_samples=1, sample_protein_mixing=False, scale_protein=False, include_protein_background=False, batch_size=None, return_mean=True, return_numpy=None)

Returns the normalized gene expression and protein expression.

For genes, this is denoted as \(\rho_n\) in the totalVI paper. For proteins, it is \((1-\pi_{nt})\alpha_{nt}\beta_{nt}\).

Parameters
adata

AnnData object with equivalent structure to initial AnnData. If None, defaults to the AnnData object used to initialize the model.

indices

Indices of cells in adata to use. If None, all cells are used.

transform_batch : Optional[Sequence[Union[Number, str]]] (default: None)

Batch to condition on. If transform_batch is:

  • None, then real observed batch is used

  • int, then batch transform_batch is used

  • List[int], then average over batches in list
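The list case can be pictured as an element-wise mean over per-batch outputs. The sketch below is only an illustration of that idea with made-up arrays; `decoded_by_batch` and `condition_on_batches` are hypothetical names, not part of scvi-tools, and the None case (observed batch) is not covered.

```python
import numpy as np

# Hypothetical decoded expression for the same 2 cells x 3 genes,
# conditioned on each of two batches (values are made up).
decoded_by_batch = {
    0: np.array([[0.2, 0.3, 0.5], [0.6, 0.3, 0.1]]),
    1: np.array([[0.4, 0.1, 0.5], [0.2, 0.5, 0.3]]),
}


def condition_on_batches(decoded_by_batch, transform_batch):
    """Use one batch (int) or average over several batches (list of int)."""
    if isinstance(transform_batch, int):
        batches = [transform_batch]
    else:
        batches = list(transform_batch)
    # Element-wise mean across the selected batches.
    return np.mean([decoded_by_batch[b] for b in batches], axis=0)


single = condition_on_batches(decoded_by_batch, 1)         # int case
averaged = condition_on_batches(decoded_by_batch, [0, 1])  # list case
```

With a single int the output is just that batch's values; with a list the result keeps the (cells, genes) shape.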

gene_list : Optional[Sequence[str]] (default: None)

Return frequencies of expression for a subset of genes. This can save memory when working with large datasets in which only a few genes are of interest.

protein_list : Optional[Sequence[str]] (default: None)

Return protein expression for a subset of proteins. This can save memory when working with large datasets in which only a few proteins are of interest.

library_size : Union[float, Literal['latent'], None] (default: 1)

Scale the expression frequencies to a common library size. This allows gene expression levels to be interpreted on a common scale of relevant magnitude.
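As a rough illustration of the float case, assuming each cell's expression frequencies sum to 1, scaling is a plain multiplication. The array `rho` and the helper `scale_to_library_size` are hypothetical; the 'latent' option is not covered here.

```python
import numpy as np

# Hypothetical expression frequencies for 2 cells x 3 genes;
# each row is assumed to sum to 1.
rho = np.array([[0.5, 0.3, 0.2], [0.1, 0.1, 0.8]])


def scale_to_library_size(rho, library_size=1.0):
    """Rescale frequencies so each cell sums to `library_size`."""
    return library_size * rho


# Express every cell on a common scale of 10,000 counts.
per_10k = scale_to_library_size(rho, 10_000)
```

Under that assumption, every row of `per_10k` sums to 10,000, so values can be compared across cells.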

n_samples : int (default: 1)

Number of posterior samples to draw when estimating the normalized expression.

sample_protein_mixing : bool (default: False)

Sample the mixing Bernoulli, setting the background to zero.

scale_protein : bool (default: False)

Make protein expression sum to 1

include_protein_background : bool (default: False)

Include background component for protein expression
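The three protein flags can be read against the protein formula \((1-\pi_{nt})\alpha_{nt}\beta_{nt}\) from the description. The NumPy sketch below is one interpretation, not the library's implementation: it assumes \(\pi\) is the probability of background, \(\beta\) the background rate, and \(\alpha\beta\) the foreground rate, and that the background term added is \(\pi\beta\). The function `protein_expression` and all values are hypothetical.

```python
import numpy as np

# Hypothetical parameters for 2 cells x 3 proteins.
pi = np.array([[0.9, 0.1, 0.5], [0.2, 0.8, 0.0]])     # P(background), assumed
beta = np.array([[2.0, 1.0, 4.0], [3.0, 2.0, 1.0]])   # background rate
alpha = np.array([[1.5, 3.0, 2.0], [2.0, 1.5, 5.0]])  # foreground scaling


def protein_expression(pi, alpha, beta, sample_protein_mixing=False,
                       include_protein_background=False,
                       scale_protein=False, rng=None):
    mixing = pi
    if sample_protein_mixing:
        # Draw a hard 0/1 background indicator instead of using its mean.
        rng = np.random.default_rng() if rng is None else rng
        mixing = rng.binomial(1, pi).astype(float)
    value = (1.0 - mixing) * alpha * beta   # foreground component
    if include_protein_background:
        value = value + mixing * beta       # assumed form of the background term
    if scale_protein:
        # Normalize each cell so its proteins sum to 1 (guarding against 0).
        value = value / np.maximum(value.sum(axis=-1, keepdims=True), 1e-12)
    return value


foreground = protein_expression(pi, alpha, beta)
with_background = protein_expression(pi, alpha, beta, include_protein_background=True)
scaled = protein_expression(pi, alpha, beta, scale_protein=True)
sampled = protein_expression(pi, alpha, beta, sample_protein_mixing=True,
                             rng=np.random.default_rng(0))
```

In this reading, sampling the indicator makes each entry either zero (background) or the full foreground rate \(\alpha\beta\).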

batch_size : Optional[int] (default: None)

Minibatch size for data loading into model. Defaults to scvi.settings.batch_size.

return_mean : bool (default: True)

Whether to return the mean of the samples.

return_numpy : Optional[bool] (default: None)

Return a np.ndarray instead of a pd.DataFrame. The DataFrame includes gene names as columns. If either n_samples=1 or return_mean=True, defaults to False. Otherwise, it defaults to True.
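The default rule for return_numpy can be written out as a small function. This is a sketch of the rule as stated here, not the library's code; `resolve_return_numpy` is a hypothetical name, and an explicitly passed value is simply returned unchanged in this sketch.

```python
def resolve_return_numpy(n_samples, return_mean, return_numpy=None):
    """Resolve the effective return_numpy value from the documented defaults."""
    if return_numpy is None:
        # Defaults to False when a 2-D (cells, genes) result is produced,
        # i.e. a single sample or a mean over samples; True otherwise.
        return not (n_samples == 1 or return_mean)
    return return_numpy


single_sample = resolve_return_numpy(n_samples=1, return_mean=False)
mean_of_many = resolve_return_numpy(n_samples=25, return_mean=True)
raw_samples = resolve_return_numpy(n_samples=25, return_mean=False)
```

Only the last case, multiple samples without averaging, resolves to a NumPy array by default.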

Return type

Tuple[Union[ndarray, DataFrame], Union[ndarray, DataFrame]]

Returns

  • gene_normalized_expression - normalized expression for RNA

  • protein_normalized_expression - normalized expression for proteins
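A minimal calling sketch, showing the two return values being unpacked. The keyword arguments come from the signature above; `denoised_expression` is a hypothetical helper, and `_StubModel` is a stand-in used only so the snippet runs without scvi-tools or data. In practice `model` would be a trained TOTALVI instance.

```python
class _StubModel:
    """Hypothetical stand-in for a trained TOTALVI model."""

    def get_normalized_expression(self, **kwargs):
        self.last_kwargs = kwargs  # record the call for inspection
        return "rna_frame", "protein_frame"


def denoised_expression(model, n_samples=25, transform_batch=None):
    """Return (RNA, protein) normalized expression averaged over samples."""
    rna, protein = model.get_normalized_expression(
        n_samples=n_samples,
        return_mean=True,
        transform_batch=transform_batch,
    )
    return rna, protein


stub = _StubModel()
rna, protein = denoised_expression(stub, transform_batch=[0, 1])
```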

If n_samples > 1 and return_mean is False, then the shape is (samples, cells, genes). Otherwise, shape is (cells, genes). Return type is pd.DataFrame unless return_numpy is True.
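The two shapes can be illustrated with random stand-in data. The arrays below are made up and only demonstrate the axis layout described above.

```python
import numpy as np

n_samples, n_cells, n_genes = 5, 4, 3
rng = np.random.default_rng(0)

# Stand-in for n_samples > 1 with return_mean=False: (samples, cells, genes).
samples = rng.random((n_samples, n_cells, n_genes))

# Averaging over the sample axis gives the return_mean=True layout: (cells, genes).
mean = samples.mean(axis=0)
```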