scvi.model.base.ArchesMixin

class scvi.model.base.ArchesMixin

Universal scArches implementation.

Methods table

load_query_data(adata, reference_model[, ...])

Online update of a reference model with scArches algorithm [Lotfollahi et al., 2021].

prepare_query_anndata(adata, reference_model)

Prepare data for query integration.

prepare_query_mudata(mdata, reference_model)

Prepare multimodal dataset for query integration.

Methods

classmethod ArchesMixin.load_query_data(adata, reference_model, inplace_subset_query_vars=False, accelerator='auto', device='auto', unfrozen=False, freeze_dropout=False, freeze_expression=True, freeze_decoder_first_layer=True, freeze_batchnorm_encoder=True, freeze_batchnorm_decoder=False, freeze_classifier=True)

Online update of a reference model with scArches algorithm [Lotfollahi et al., 2021].

Parameters:
  • adata (AnnData | MuData) – AnnData organized in the same way as data used to train model. It is not necessary to run setup_anndata, as AnnData is validated against the registry.

  • reference_model (str | BaseModelClass) – Either an already instantiated model of the same class, or a path to saved outputs for reference model.

  • inplace_subset_query_vars (bool (default: False)) – Whether to subset and rearrange query vars inplace based on vars used to train reference model.

  • accelerator (str (default: 'auto')) – Supports passing different accelerator types (“cpu”, “gpu”, “tpu”, “ipu”, “hpu”, “mps”, “auto”) as well as custom accelerator instances.

  • device (int | str (default: 'auto')) – The device to use. Can be set to a non-negative index (int or str) or “auto” for automatic selection based on the chosen accelerator. If set to “auto” and accelerator is not determined to be “cpu”, then device will be set to the first available device.

  • unfrozen (bool (default: False)) – If True, override all other freeze options and leave the model fully unfrozen.

  • freeze_dropout (bool (default: False)) – Whether to freeze dropout during training.

  • freeze_expression (bool (default: True)) – Freeze neurons corresponding to expression in the first layer.

  • freeze_decoder_first_layer (bool (default: True)) – Freeze neurons corresponding to the first layer in the decoder.

  • freeze_batchnorm_encoder (bool (default: True)) – Whether to freeze batchnorm weight and bias during training for the encoder.

  • freeze_batchnorm_decoder (bool (default: False)) – Whether to freeze batchnorm weight and bias during training for the decoder.

  • freeze_classifier (bool (default: True)) – Whether to freeze classifier completely. Only applies to SCANVI.
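As a rough illustration of how these pieces are typically combined, the sketch below wraps the two calls documented on this page in a small helper. The helper name `update_with_query` and its `model_cls` argument are hypothetical (they are not part of scvi-tools); `model_cls` stands for any model class that mixes in ArchesMixin, and the freeze options are simply forwarded as keyword arguments.

```python
def update_with_query(model_cls, query_adata, reference_model, **freeze_kwargs):
    """Hypothetical helper: align a query dataset, then load it into a reference model.

    model_cls       -- assumed to be a class that inherits from ArchesMixin
    query_adata     -- the query AnnData
    reference_model -- an instantiated reference model or a path to its saved outputs
    freeze_kwargs   -- forwarded unchanged, e.g. freeze_dropout=True or unfrozen=True
    """
    # Pad missing features with zeros and sort features to match the reference.
    # With the default inplace=True, the query object is modified in place.
    model_cls.prepare_query_anndata(query_adata, reference_model)

    # Build the query model from the reference model using the scArches approach.
    return model_cls.load_query_data(query_adata, reference_model, **freeze_kwargs)
```

With scvi-tools installed, a call might look like `update_with_query(scvi.model.SCVI, query_adata, ref_model)`, where `query_adata` and `ref_model` are objects you already hold. The returned model would then usually be trained further on the query data before use.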

static ArchesMixin.prepare_query_anndata(adata, reference_model, return_reference_var_names=False, inplace=True)

Prepare data for query integration.

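This function pads the query AnnData with zeros for features that are missing relative to the reference and sorts the features to match the reference order. With inplace=True the query object is modified in place; otherwise a new AnnData object is returned.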

Parameters:
  • adata (AnnData) – AnnData organized in the same way as data used to train model. It is not necessary to run setup_anndata, as AnnData is validated against the registry.

  • reference_model (str | BaseModelClass) – Either an already instantiated model of the same class, or a path to saved outputs for reference model.

  • return_reference_var_names (bool (default: False)) – Only load and return reference var names if True.

  • inplace (bool (default: True)) – Whether to subset and rearrange query vars inplace or return new AnnData.

Return type:

AnnData | Index | None

Returns:

Query adata ready to use in load_query_data, unless return_reference_var_names is True, in which case a pd.Index of reference var names is returned.
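To make "padded zeros for missing features" and "correctly sorted features" concrete, here is a minimal pure-Python sketch of the idea on plain lists. It is not the scvi-tools implementation, and the function name `align_to_reference` is made up for illustration; it only mimics the reordering, zero-padding, and subsetting on a tiny count matrix.

```python
def align_to_reference(rows, query_vars, reference_vars):
    """Reorder columns to the reference order, filling missing features with 0.

    rows           -- list of per-cell count lists, columns ordered as query_vars
    query_vars     -- feature names present in the query
    reference_vars -- feature names the reference model was trained on
    """
    # Map each query feature name to its column position.
    column_of = {name: i for i, name in enumerate(query_vars)}
    aligned = []
    for row in rows:
        # Walk the reference order; features absent from the query become 0,
        # and query-only features are dropped because they are never visited.
        aligned.append(
            [row[column_of[name]] if name in column_of else 0 for name in reference_vars]
        )
    return aligned


query_counts = [[1, 2, 3]]                      # one cell, three features
aligned = align_to_reference(
    query_counts,
    query_vars=["g3", "g1", "gX"],              # "gX" is not in the reference
    reference_vars=["g1", "g2", "g3"],          # "g2" is missing from the query
)
print(aligned)  # [[2, 0, 1]]
```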

static ArchesMixin.prepare_query_mudata(mdata, reference_model, return_reference_var_names=False, inplace=True)

Prepare multimodal dataset for query integration.

This function pads the AnnData object of each modality with zeros for features that are missing relative to the reference and sorts the features to match the reference order. With inplace=True the query MuData is modified in place; otherwise a new MuData object is returned.

Parameters:
  • mdata (MuData) – MuData organized in the same way as data used to train model. It is not necessary to run setup_mudata, as MuData is validated against the registry.

  • reference_model (str | BaseModelClass) – Either an already instantiated model of the same class, or a path to saved outputs for reference model.

  • return_reference_var_names (bool (default: False)) – Only load and return reference var names if True.

  • inplace (bool (default: True)) – Whether to subset and rearrange query vars inplace or return new MuData.

Return type:

MuData | dict[str, Index] | None

Returns:

Query mudata ready to use in load_query_data, unless return_reference_var_names is True, in which case a dictionary of pd.Index of reference var names, one per modality, is returned.
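One possible use of the dictionary returned with return_reference_var_names=True is to check, per modality, how many reference features the query actually contains before padding. The sketch below assumes that dictionary has already been obtained and that the query feature names are available as a similar mapping; the helper name `shared_feature_fraction` is hypothetical, and plain lists stand in for pd.Index objects.

```python
def shared_feature_fraction(query_vars, reference_vars):
    """Per modality, the fraction of reference features present in the query.

    query_vars     -- dict mapping modality name to the query's feature names
    reference_vars -- dict mapping modality name to the reference feature names
    """
    fractions = {}
    for modality, ref_names in reference_vars.items():
        # A modality missing from the query shares no features with the reference.
        present = set(query_vars.get(modality, ()))
        hits = sum(1 for name in ref_names if name in present)
        fractions[modality] = hits / len(ref_names) if len(ref_names) else 0.0
    return fractions


overlap = shared_feature_fraction(
    query_vars={"rna": ["g1", "g3"], "protein": ["CD4"]},
    reference_vars={"rna": ["g1", "g2", "g3", "g4"], "protein": ["CD3", "CD4"]},
)
print(overlap)  # {'rna': 0.5, 'protein': 0.5}
```

A low fraction means most of that modality would be zero-padded, which may be worth inspecting before mapping the query.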