scvi.model.MULTIVI

class scvi.model.MULTIVI(adata, n_genes, n_regions, n_hidden=None, n_latent=None, n_layers_encoder=2, n_layers_decoder=2, dropout_rate=0.1, region_factors=True, gene_likelihood='zinb', use_batch_norm='none', use_layer_norm='both', latent_distribution='normal', deeply_inject_covariates=False, encode_covariates=False, fully_paired=False, **model_kwargs)

Integration of multi-modal and single-modality data [AshuachGabitto21].

MultiVI is used to integrate multiomic datasets with single-modality (expression or accessibility) datasets.

Parameters
adata : AnnData

AnnData object that has been registered via setup_anndata().

n_genes : int

The number of gene expression features (genes).

n_regions : int

The number of accessibility features (genomic regions).

n_hidden : int | None (default: None)

Number of nodes per hidden layer. If None, defaults to the square root of the number of regions.

n_latent : int | None (default: None)

Dimensionality of the latent space. If None, defaults to the square root of n_hidden.

n_layers_encoder : int (default: 2)

Number of hidden layers used for encoder NNs.

n_layers_decoder : int (default: 2)

Number of hidden layers used for decoder NNs.

dropout_rate : float (default: 0.1)

Dropout rate for neural networks.

model_depth

Model sequencing depth / library size.

region_factors : bool (default: True)

Include region-specific factors in the model.

latent_distribution : {'normal', 'ln'} (default: 'normal')

One of:

  • 'normal' - Normal distribution
  • 'ln' - Logistic normal distribution (Normal(0, I) transformed by softmax)

deeply_inject_covariates : bool (default: False)

Whether to deeply inject covariates into all layers of the decoder. If False, covariates will only be included in the input layer.

fully_paired : bool (default: False)

Allows the model to be simplified when the data is fully paired. Currently ignored.

**model_kwargs

Keyword args for PEAKVAE
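
The 'ln' option of latent_distribution is described above as Normal(0, I) transformed by softmax. As a rough illustration of what that transformation does, the sketch below draws one such sample using only the standard library. The function name sample_logistic_normal is hypothetical and is not part of scvi; the model's own sampling code may differ.

```python
import math
import random


def sample_logistic_normal(n_latent, rng):
    """Draw z ~ Normal(0, I) and map it onto the simplex with a softmax.

    Hypothetical helper for illustration only; not part of scvi.
    """
    z = [rng.gauss(0.0, 1.0) for _ in range(n_latent)]
    # Subtract the maximum before exponentiating for numerical stability.
    z_max = max(z)
    exp_z = [math.exp(v - z_max) for v in z]
    total = sum(exp_z)
    return [v / total for v in exp_z]


rng = random.Random(0)
sample = sample_logistic_normal(5, rng)
print(sample)       # five positive values
print(sum(sample))  # sums to 1 up to floating-point error
```

Each sample is a vector of positive values that sums to one, so a 'ln' latent space lies on a simplex, whereas a 'normal' latent space is unconstrained.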

Examples

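>>> import anndata
>>> import scvi
>>> adata_rna = anndata.read_h5ad(path_to_rna_anndata)
>>> adata_atac = scvi.data.read_10x_atac(path_to_atac_anndata)
>>> adata_multi = scvi.data.read_10x_multiome(path_to_multiomic_anndata)
>>> adata_mvi = scvi.data.organize_multiome_anndatas(adata_multi, adata_rna, adata_atac)
>>> scvi.data.setup_anndata(adata_mvi, batch_key="modality")
>>> # n_genes and n_regions are required; this assumes var["modality"] holds the 10x feature types.
>>> n_genes = int((adata_mvi.var["modality"] == "Gene Expression").sum())
>>> n_regions = int((adata_mvi.var["modality"] == "Peaks").sum())
>>> vae = scvi.model.MULTIVI(adata_mvi, n_genes=n_genes, n_regions=n_regions)
>>> vae.train()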
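
When n_hidden and n_latent are left at None, as in the example above, the Parameters section says they default to the square root of the number of regions and the square root of n_hidden, respectively. The sketch below works through that arithmetic. The helper default_layer_sizes is hypothetical, and truncating to an integer is an assumption; the documentation does not state how the square roots are rounded.

```python
import math


def default_layer_sizes(n_regions, n_hidden=None, n_latent=None):
    """Resolve n_hidden and n_latent as described in the Parameters section.

    Hypothetical helper; truncation via int() is an assumption.
    """
    if n_hidden is None:
        n_hidden = int(math.sqrt(n_regions))
    if n_latent is None:
        n_latent = int(math.sqrt(n_hidden))
    return n_hidden, n_latent


print(default_layer_sizes(10000))               # (100, 10)
print(default_layer_sizes(10000, n_hidden=64))  # (64, 8)
```

Because both defaults follow from n_regions, a dataset with few accessibility features yields a very small latent space; in that case it may be worth setting n_hidden and n_latent explicitly.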

Notes

  • The model assumes that the features are organized so that all expression features are consecutive, followed by all accessibility features. For example, if the data has 100 genes and 250 genomic regions, the model assumes that the first 100 features are genes, and the next 250 are the regions.

  • The main batch annotation, specified in scvi.data.setup_anndata, should correspond to the modality each cell originated from. This allows the model to focus its mixing efforts, which use an adversarial component, on mixing the modalities. Other covariates can be specified using the categorical_covariate_keys argument.
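
The feature layout described in the first note can be checked with plain slicing. The sketch below splits one cell's feature vector into its expression and accessibility parts, using the 100-gene, 250-region example from the note. The helper split_features is hypothetical and is not part of scvi.

```python
def split_features(row, n_genes, n_regions):
    """Split one cell's features into (expression, accessibility).

    Hypothetical helper; assumes genes come first, followed by regions.
    """
    if len(row) != n_genes + n_regions:
        raise ValueError(
            f"expected {n_genes + n_regions} features, got {len(row)}"
        )
    return row[:n_genes], row[n_genes:]


row = list(range(350))  # stand-in for one cell with 350 features
expression, accessibility = split_features(row, n_genes=100, n_regions=250)
print(len(expression), len(accessibility))  # 100 250
```

If the features in an AnnData object are not ordered this way, they need to be reordered before the model is constructed, since the model relies on n_genes and n_regions alone to locate each modality.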

Attributes

device

history

Returns computed metrics during training.

is_trained

test_indices

train_indices

validation_indices

Methods

differential_accessibility([adata, groupby, …])

A unified method for differential accessibility analysis.

differential_expression([adata, groupby, …])

A unified method for differential expression analysis.

get_accessibility_estimates([adata, …])

Return type: ndarray | csr_matrix

get_elbo([adata, indices, batch_size])

Return the ELBO for the data.

get_latent_representation([adata, modality, …])

Return the latent representation for each cell.

get_library_size_factors([adata, indices, …])

get_marginal_ll([adata, indices, …])

Return the marginal LL for the data.

get_normalized_expression([adata, indices, …])

Returns the normalized (decoded) gene expression.

get_reconstruction_error([adata, indices, …])

Return the reconstruction error for the data.

get_region_factors()

Return type: ndarray

load(dir_path[, adata, use_gpu])

Instantiate a model from the saved output.

save(dir_path[, overwrite, save_anndata])

Save the state of the model.

to_device(device)

Move model to device.

train([max_epochs, lr, use_gpu, train_size, …])

Trains the model using amortized variational inference.