scvi.data.AnnDataManager#

class scvi.data.AnnDataManager(fields=None, setup_method_args=None, validation_checks=None)[source]#

Provides an interface to validate and process an AnnData object for use in scvi-tools.

A class which wraps a collection of AnnDataField instances and provides an interface to validate and process an AnnData object with respect to the fields.

Parameters:
  • fields (list[type[BaseAnnDataField]] | None (default: None)) – List of AnnDataFields to initialize with.

  • setup_method_args (dict | None (default: None)) – Dictionary describing the model and arguments passed in by the user to setup this AnnDataManager.

  • validation_checks (AnnDataManagerValidationCheck | None (default: None)) – DataClass specifying which global validation checks to run on the data object.

Examples

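>>> from scvi.data import AnnDataManager
>>> from scvi.data.fields import LayerField
>>> # `adata` is an existing AnnData object with a "raw_counts" layer.
>>> fields = [LayerField("counts", "raw_counts")]
>>> adata_manager = AnnDataManager(fields=fields)
>>> adata_manager.register_fields(adata)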

Notes

This class is not initialized with a specific AnnData object; self.adata is set later via register_fields(). This decouples the general definition of the scvi-tools interface from the registration of a particular data instance.
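The decoupling described above can be illustrated with a small standalone sketch. It does not use scvi-tools: ToyField and ToyManager are hypothetical stand-ins, and a plain dict plays the role of the AnnData object. It only shows the pattern of defining fields first and binding data later.

```python
class ToyField:
    """Hypothetical stand-in for an AnnDataField (not part of scvi-tools)."""

    def __init__(self, registry_key, source_key):
        self.registry_key = registry_key
        self.source_key = source_key

    def register(self, data):
        # Validate that the requested key exists, then record where to find it.
        if self.source_key not in data:
            raise KeyError(f"{self.source_key!r} not found in data")
        return {"source_key": self.source_key}


class ToyManager:
    """Hypothetical manager: defined by its fields, bound to data only later."""

    def __init__(self, fields=None):
        self.fields = list(fields or [])
        self.data = None  # no data object is known at construction time
        self.registry = {}

    def register_fields(self, data):
        # Each field validates the data and contributes one registry entry.
        self.registry = {f.registry_key: f.register(data) for f in self.fields}
        self.data = data


fields = [ToyField("counts", "raw_counts")]
manager = ToyManager(fields=fields)
unbound = manager.data is None  # True: the definition exists without any data

toy_data = {"raw_counts": [[1, 0], [0, 3]]}  # a dict standing in for AnnData
manager.register_fields(toy_data)
print(manager.registry)
```

Because the field list is independent of any data object, the same definition can be registered against several data objects in turn.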

See further usage examples in the following tutorials:

  1. Data handling in scvi-tools

Attributes table#

adata_uuid

Returns the UUID for the AnnData object registered with this instance.

data_registry

Returns the data registry for the AnnData object registered with this instance.

registry

Returns the top-level registry dictionary for the AnnData object registered with this instance as an attrdict.

summary_stats

Returns the summary stats for the AnnData object registered with this instance.

Methods table#

create_torch_dataset([indices, ...])

Creates a torch dataset from the AnnData object registered with this instance.

get_from_registry(registry_key)

Returns the object in AnnData associated with the key in the data registry.

get_state_registry(registry_key)

Returns the state registry for the AnnDataField registered with this instance.

register_fields(adata[, source_registry])

Registers each field associated with this instance with the AnnData object.

register_new_fields(fields)

Register new fields to a manager instance.

transfer_fields(adata_target, **kwargs)

Transfers an existing setup to each field associated with this instance with the target AnnData object.

update_setup_method_args(setup_method_args)

Update setup method args.

validate()

Checks whether the AnnData object was last set up with this AnnDataManager instance and re-registers it if not.

view_registry([hide_state_registries])

Prints summary of the registry.

view_setup_method_args(registry)

Prints setup kwargs used to produce a given registry.

Attributes#

AnnDataManager.adata_uuid[source]#

Returns the UUID for the AnnData object registered with this instance.

AnnDataManager.data_registry[source]#

Returns the data registry for the AnnData object registered with this instance.

AnnDataManager.registry[source]#

Returns the top-level registry dictionary for the AnnData object registered with this instance as an attrdict.

AnnDataManager.summary_stats[source]#

Returns the summary stats for the AnnData object registered with this instance.

Methods#

AnnDataManager.create_torch_dataset(indices=None, data_and_attributes=None, load_sparse_tensor=False)[source]#

Creates a torch dataset from the AnnData object registered with this instance.

Parameters:
  • indices (Sequence[int] | Sequence[bool] (default: None)) – The indices of the observations in the adata to use.

  • data_and_attributes (list[str] | dict[str, dtype] | None (default: None)) – Either a dictionary whose keys are keys in the data registry (adata_manager.data_registry) and whose values are the desired numpy loading types (later converted to torch tensors), or a list of such keys. A list can be used to subset to certain keys when more tensors than needed have been registered. If None, defaults to all registered data.

  • load_sparse_tensor (bool (default: False)) – EXPERIMENTAL If True, loads data with sparse CSR or CSC layout as a Tensor with the same layout. Can lead to speedups in data transfers to GPUs, depending on the sparsity of the data.

Return type:

AnnTorchDataset

Returns:

AnnTorchDataset
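The three accepted forms of data_and_attributes (dictionary, list, or None) can be pictured as being reduced to one mapping from registry key to loading type. The sketch below is an illustration only: normalize_data_and_attributes is a hypothetical helper, not a scvi-tools function, dtypes are represented as plain strings, and using None to mean "use the default loading type" is an assumption made for the example.

```python
def normalize_data_and_attributes(data_and_attributes, registered_keys):
    """Hypothetical helper: reduce the list/dict/None forms to key -> dtype-or-None."""
    if data_and_attributes is None:
        # None selects every registered key.
        return {key: None for key in registered_keys}
    if isinstance(data_and_attributes, dict):
        selected = dict(data_and_attributes)
    else:
        # A list only subsets the keys; no loading type is specified.
        selected = {key: None for key in data_and_attributes}
    unknown = [key for key in selected if key not in registered_keys]
    if unknown:
        raise KeyError(f"keys not in the data registry: {unknown}")
    return selected


registered = ["X", "batch", "labels"]  # illustrative registry keys
from_none = normalize_data_and_attributes(None, registered)
from_list = normalize_data_and_attributes(["X", "batch"], registered)
from_dict = normalize_data_and_attributes({"X": "float32"}, registered)
print(from_none, from_list, from_dict, sep="\n")
```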

AnnDataManager.get_from_registry(registry_key)[source]#

Returns the object in AnnData associated with the key in the data registry.

Parameters:

registry_key (str) – Key of the object to get from self.data_registry.

Return type:

ndarray | DataFrame

Returns:

The requested data.

AnnDataManager.get_state_registry(registry_key)[source]#

Returns the state registry for the AnnDataField registered with this instance.

Return type:

attrdict

AnnDataManager.register_fields(adata, source_registry=None, **transfer_kwargs)[source]#

Registers each field associated with this instance with the AnnData object.

Either registers the fields or, if source_registry is passed in, transfers the setup from it. Sets self.adata.

Parameters:
  • adata (Union[AnnData, MuData]) – AnnData object to be registered.

  • source_registry (dict | None (default: None)) – Registry created after registering an AnnData using an AnnDataManager object.

  • transfer_kwargs – Additional keywords which modify transfer behavior. Only applicable if source_registry is set.

AnnDataManager.register_new_fields(fields)[source]#

Register new fields to a manager instance.

This is useful to augment the functionality of an existing manager.

Parameters:

fields (list[type[BaseAnnDataField]]) – List of AnnDataFields to register.

AnnDataManager.transfer_fields(adata_target, **kwargs)[source]#

Transfers an existing setup to each field associated with this instance with the target AnnData object.

Creates a new AnnDataManager instance with the same set of fields. Then, registers the fields with a target AnnData object, incorporating details of the source registry where necessary (e.g. for validation or modified data setup).

Parameters:
  • adata_target (Union[AnnData, MuData]) – AnnData object to be registered.

  • kwargs – Additional keywords which modify transfer behavior.

Return type:

AnnDataManager
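As a rough picture of the transfer described above, the following standalone sketch creates a new manager with the same fields and registers a target data object while reusing the source registry for validation. ToyCategoricalField and ToyManager are hypothetical classes written for this example, dicts stand in for AnnData objects, and the real transfer logic in scvi-tools may differ.

```python
class ToyCategoricalField:
    """Hypothetical field that records the categories seen in one column."""

    def __init__(self, registry_key, column):
        self.registry_key = registry_key
        self.column = column

    def register(self, data):
        # Fresh setup: derive the categories from the data itself.
        return {"categories": sorted(set(data[self.column]))}

    def transfer(self, source_state, data_target):
        # Transfer: validate the target against the source categories and reuse them.
        unseen = set(data_target[self.column]) - set(source_state["categories"])
        if unseen:
            raise ValueError(f"unseen categories in target: {sorted(unseen)}")
        return {"categories": list(source_state["categories"])}


class ToyManager:
    """Hypothetical manager supporting fresh registration and transfer."""

    def __init__(self, fields):
        self.fields = list(fields)
        self.registry = {}
        self.data = None

    def register_fields(self, data, source_registry=None):
        if source_registry is None:
            self.registry = {f.registry_key: f.register(data) for f in self.fields}
        else:
            self.registry = {
                f.registry_key: f.transfer(source_registry[f.registry_key], data)
                for f in self.fields
            }
        self.data = data

    def transfer_fields(self, data_target):
        # New manager, same fields, registered against the target data.
        new_manager = ToyManager(self.fields)
        new_manager.register_fields(data_target, source_registry=self.registry)
        return new_manager


source = ToyManager([ToyCategoricalField("batch", "batch")])
source.register_fields({"batch": ["a", "b", "a"]})
target = source.transfer_fields({"batch": ["b", "b"]})
print(target.registry)
```

In this sketch the target keeps the category "a" even though it never appears in the target data, which is the kind of detail a source registry can carry over.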

AnnDataManager.update_setup_method_args(setup_method_args)[source]#

Update setup method args.

Parameters:

setup_method_args (dict) – Despite the name, this is a dict of setup-method kwargs that will be used to update the existing values in the registry of this instance.

AnnDataManager.validate()[source]#

Checks whether the AnnData object was last set up with this AnnDataManager instance and re-registers it if not.

Return type:

None
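One way to picture this check is a manager that stamps the data object with its own identifier on registration and compares that stamp later. The sketch below is an assumption-laden illustration: ToyManager, the "manager_id" key, and the registration counter are all hypothetical, a dict stands in for AnnData, and the actual mechanism in scvi-tools may differ.

```python
from uuid import uuid4


class ToyManager:
    """Hypothetical manager that stamps data with its own identifier."""

    def __init__(self):
        self.id = str(uuid4())
        self.data = None
        self.registrations = 0  # counter kept only to make the behavior visible

    def register_fields(self, data):
        data["manager_id"] = self.id  # record which manager set this data up last
        self.data = data
        self.registrations += 1

    def validate(self):
        if self.data is None:
            raise RuntimeError("no data object has been registered")
        # Re-register only if another manager has set up the data since.
        if self.data.get("manager_id") != self.id:
            self.register_fields(self.data)


data = {"counts": [1, 2, 3]}  # a dict standing in for AnnData
first, second = ToyManager(), ToyManager()
first.register_fields(data)
second.register_fields(data)  # the stamp now belongs to `second`

first.validate()  # stamp mismatch, so `first` re-registers
first.validate()  # stamp matches, so nothing happens
print(first.registrations, data["manager_id"] == first.id)
```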

AnnDataManager.view_registry(hide_state_registries=False)[source]#

Prints summary of the registry.

Parameters:

hide_state_registries (bool (default: False)) – If True, prints a shortened summary without details of each state registry.

Return type:

None

static AnnDataManager.view_setup_method_args(registry)[source]#

Prints setup kwargs used to produce a given registry.

Parameters:

registry (dict) – Registry produced by an AnnDataManager.

Return type:

None