scvi.data.AnnDataManager#

class scvi.data.AnnDataManager(fields=None, setup_method_args=None, validation_checks=None)[source]#

Provides an interface to validate and process an AnnData object for use in scvi-tools.

A class which wraps a collection of AnnDataField instances and provides an interface to validate and process an AnnData object with respect to the fields.

Parameters
  • fields (Optional[list[Type[BaseAnnDataField]]] (default: None)) – List of AnnDataFields to initialize with.

  • setup_method_args (Optional[dict] (default: None)) – Dictionary describing the model and the arguments passed in by the user to set up this AnnDataManager.

  • validation_checks (Optional[AnnDataManagerValidationCheck] (default: None)) – DataClass specifying which global validation checks to run on the data object.

Examples

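>>> from scvi.data import AnnDataManager
>>> from scvi.data.fields import LayerField
>>> fields = [LayerField("counts", "raw_counts")]
>>> adata_manager = AnnDataManager(fields=fields)
>>> adata_manager.register_fields(adata)  # adata: an AnnData object with a "raw_counts" layer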

Notes

This class is not initialized with a specific AnnData object; self.adata is set later via register_fields(). This decouples the general definition of the scvi-tools interface from the registration of a particular data instance.
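The decoupling described above can be illustrated with a minimal, standard-library-only sketch. ToyField and ToyManager are hypothetical stand-ins invented for this illustration, and a plain dict plays the role of the data object; this is not how scvi-tools implements AnnDataManager.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ToyField:
    """Hypothetical stand-in for an AnnDataField: maps a registry key to a data key."""

    registry_key: str
    data_key: str

    def validate(self, data: dict) -> None:
        # A field can only be registered if the data actually contains its key.
        if self.data_key not in data:
            raise KeyError(f"{self.data_key!r} not found in data")


@dataclass
class ToyManager:
    """Hypothetical manager: defined by its fields, bound to data only later."""

    fields: list
    data: Optional[dict] = None
    registry: dict = field(default_factory=dict)

    def register_fields(self, data: dict) -> None:
        # Validate every field first, then bind the data and build the registry.
        for toy_field in self.fields:
            toy_field.validate(data)
        self.data = data
        self.registry = {f.registry_key: f.data_key for f in self.fields}

    def get_from_registry(self, registry_key: str):
        # Look up the data key for this registry key, then fetch the data.
        return self.data[self.registry[registry_key]]


# The manager exists before any data does.
manager = ToyManager(fields=[ToyField("counts", "raw_counts")])
unbound = manager.data is None

# Data is attached in a separate step.
toy_data = {"raw_counts": [[1, 0], [0, 3]]}
manager.register_fields(toy_data)
counts = manager.get_from_registry("counts")
```

Because the field definitions live apart from the data, the same list of fields could be reused to register a second data object without redefining the interface.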

See further usage examples in the following tutorials:

  1. Data handling in scvi-tools

Attributes table#

adata_uuid

Returns the UUID for the AnnData object registered with this instance.

data_registry

Returns the data registry for the AnnData object registered with this instance.

registry

Returns the top-level registry dictionary for the AnnData object registered with this instance as an attrdict.

summary_stats

Returns the summary stats for the AnnData object registered with this instance.

Methods table#

create_torch_dataset([indices, ...])

Creates a torch dataset from the AnnData object registered with this instance.

get_from_registry(registry_key)

Returns the object in AnnData associated with the key in the data registry.

get_state_registry(registry_key)

Returns the state registry for the AnnDataField registered with this instance.

register_fields(adata[, source_registry])

Registers each field associated with this instance with the AnnData object.

register_new_fields(fields)

Register new fields to a manager instance.

transfer_fields(adata_target, **kwargs)

Transfers an existing setup to the target AnnData object for each field associated with this instance.

update_setup_method_args(setup_method_args)

Update setup method args.

validate()

Checks if AnnData was last set up with this AnnDataManager instance and re-registers it if not.

view_registry([hide_state_registries])

Prints summary of the registry.

view_setup_method_args(registry)

Prints setup kwargs used to produce a given registry.

Attributes#

AnnDataManager.adata_uuid[source]#

Returns the UUID for the AnnData object registered with this instance.

AnnDataManager.data_registry[source]#

Returns the data registry for the AnnData object registered with this instance.

AnnDataManager.registry[source]#

Returns the top-level registry dictionary for the AnnData object registered with this instance as an attrdict.

AnnDataManager.summary_stats[source]#

Returns the summary stats for the AnnData object registered with this instance.

Methods#

AnnDataManager.create_torch_dataset(indices=None, data_and_attributes=None)[source]#

Creates a torch dataset from the AnnData object registered with this instance.

Parameters
  • indices (Union[Sequence[int], Sequence[bool], None] (default: None)) – The indices of the observations in adata to use.

  • data_and_attributes (Union[list[str], dict[str, dtype], None] (default: None)) – Either a dictionary whose keys are keys in the data registry (adata_manager.data_registry) and whose values are the desired NumPy loading types (later converted to torch tensors), or a list of such keys. A list can be used to subset to certain keys when more tensors than needed have been registered. If None, defaults to all registered data.

Return type

AnnTorchDataset

Returns

Torch Dataset
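The three accepted forms of data_and_attributes (None, a list of keys, or a dict of key to dtype) can be pictured with the following sketch. resolve_data_and_attributes is a hypothetical helper written for this illustration, dtypes are represented as plain strings to keep it dependency-free, and the "float32" default is an assumption of the sketch rather than documented behavior.

```python
from typing import Optional, Union


def resolve_data_and_attributes(
    data_and_attributes: Optional[Union[list, dict]],
    registered_keys: list,
    default_dtype: str = "float32",  # assumed default, for illustration only
) -> dict:
    """Resolve the three accepted forms into a {registry_key: dtype name} dict."""
    if data_and_attributes is None:
        # None: load every registered key.
        resolved = {key: default_dtype for key in registered_keys}
    elif isinstance(data_and_attributes, dict):
        # dict: the caller chooses a loading type per key.
        resolved = dict(data_and_attributes)
    else:
        # list: subset of keys, each loaded with the default type.
        resolved = {key: default_dtype for key in data_and_attributes}

    unknown = [key for key in resolved if key not in registered_keys]
    if unknown:
        raise KeyError(f"Keys not in the data registry: {unknown}")
    return resolved


keys = ["X", "batch", "labels"]
everything = resolve_data_and_attributes(None, keys)
subset = resolve_data_and_attributes(["X"], keys)
typed = resolve_data_and_attributes({"X": "float32", "batch": "int64"}, keys)
```

Passing a list is the lighter option when only a subset of tensors is needed; the dict form is for cases where a specific loading type matters.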

AnnDataManager.get_from_registry(registry_key)[source]#

Returns the object in AnnData associated with the key in the data registry.

Parameters

registry_key (str) – Key of the object to get from self.data_registry.

Return type

ndarray | DataFrame

Returns

The requested data.

AnnDataManager.get_state_registry(registry_key)[source]#

Returns the state registry for the AnnDataField registered with this instance.

Return type

attrdict

AnnDataManager.register_fields(adata, source_registry=None, **transfer_kwargs)[source]#

Registers each field associated with this instance with the AnnData object.

Either registers the fields from scratch or, if source_registry is passed in, transfers the setup from it. Sets self.adata.

Parameters
  • adata (Union[AnnData, MuData]) – AnnData object to be registered.

  • source_registry (Optional[dict] (default: None)) – Registry created after registering an AnnData using an AnnDataManager object.

  • transfer_kwargs – Additional keywords which modify transfer behavior. Only applicable if source_registry is set.

AnnDataManager.register_new_fields(fields)[source]#

Register new fields to a manager instance.

This is useful to augment the functionality of an existing manager.

Parameters

fields (list[Type[BaseAnnDataField]]) – List of AnnDataFields to register.

AnnDataManager.transfer_fields(adata_target, **kwargs)[source]#

Transfers an existing setup to the target AnnData object for each field associated with this instance.

Creates a new AnnDataManager instance with the same set of fields. Then, registers the fields with a target AnnData object, incorporating details of the source registry where necessary (e.g. for validation or modified data setup).

Parameters
  • adata_target (Union[AnnData, MuData]) – AnnData object to be registered.

  • kwargs – Additional keywords which modify transfer behavior.

Return type

AnnDataManager
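As a rough illustration of what "incorporating details of the source registry" can mean for a single categorical field, the sketch below reuses the category list learned from source data when encoding target data. register_categories and its extend_categories flag are hypothetical names used only here; they show how a keyword might modify transfer behavior and are not a description of the library's internals.

```python
def register_categories(values, source_categories=None, extend_categories=False):
    """Toy categorical registration returning (categories, integer codes).

    Without source_categories this is a fresh registration. With it, the
    source ordering is reused so that codes agree between source and target.
    """
    if source_categories is None:
        categories = sorted(set(values))
    else:
        categories = list(source_categories)  # copy, so the source is not mutated
        unseen = sorted(set(values) - set(categories))
        if unseen and not extend_categories:
            # Validation against the source registry fails on unknown categories.
            raise ValueError(f"Categories not seen in source: {unseen}")
        categories.extend(unseen)
    codes = [categories.index(value) for value in values]
    return categories, codes


# Fresh registration on the source data.
source_categories, source_codes = register_categories(["b1", "b2", "b1"])

# Transfer: target codes are consistent with the source mapping.
target_categories, target_codes = register_categories(
    ["b2", "b2"], source_categories=source_categories
)

# Transfer with a modifying keyword: unseen categories are appended.
extended_categories, extended_codes = register_categories(
    ["b3", "b1"], source_categories=source_categories, extend_categories=True
)
```

Reusing the source ordering is what keeps integer codes comparable between the data a model was trained on and the data it is later applied to.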

AnnDataManager.update_setup_method_args(setup_method_args)[source]#

Update setup method args.

Parameters

setup_method_args (dict) – Despite the name, this is a dict of keyword arguments of the setup method; its entries are used to update the existing values in the registry of this instance.
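The update semantics can be sketched with a plain dictionary merge. The registry layout below, including the "setup_args" key and the example argument names, is an assumption made for illustration, and update_setup_args is a hypothetical free function rather than the method itself.

```python
# Hypothetical registry layout; the "setup_args" key is an assumption.
registry = {"setup_args": {"layer": "counts", "batch_key": None}}


def update_setup_args(registry: dict, setup_method_args: dict) -> None:
    """Merge new setup kwargs into the stored ones, overriding matching keys."""
    registry["setup_args"].update(setup_method_args)


# Only batch_key changes; layer keeps its previously stored value.
update_setup_args(registry, {"batch_key": "batch"})
```

Keys that are not mentioned in the update keep their stored values, so a partial dict is enough to change a single argument.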

AnnDataManager.validate()[source]#

Checks if AnnData was last set up with this AnnDataManager instance and re-registers it if not.

Return type

None
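One way to picture the check is a manager that stamps the data with its own identifier on registration and compares that stamp later. StampingManager and the "last_manager_id" and "setup_by" keys are hypothetical and exist only in this sketch; a plain dict stands in for the AnnData object.

```python
import uuid


class StampingManager:
    """Hypothetical manager that stamps data with its own id on registration."""

    def __init__(self, name: str):
        self.id = str(uuid.uuid4())
        self.name = name
        self.data = None

    def register_fields(self, data: dict) -> None:
        # Record which manager most recently set up this data object.
        data["last_manager_id"] = self.id
        data["setup_by"] = self.name
        self.data = data

    def validate(self) -> None:
        if self.data is None:
            raise RuntimeError("No data registered with this manager.")
        # Re-register only if another manager has touched the data since.
        if self.data["last_manager_id"] != self.id:
            self.register_fields(self.data)


shared = {}
manager_a = StampingManager("model_a")
manager_b = StampingManager("model_b")

manager_a.register_fields(shared)
manager_b.register_fields(shared)  # overwrites manager_a's stamp
stamp_before = shared["setup_by"]

manager_a.validate()  # detects the foreign stamp and re-registers
stamp_after = shared["setup_by"]
```

This pattern matters when several managers share one data object: validating before use ensures the object reflects the setup of the manager about to read from it.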

AnnDataManager.view_registry(hide_state_registries=False)[source]#

Prints summary of the registry.

Parameters

hide_state_registries (bool (default: False)) – If True, prints a shortened summary without details of each state registry.

Return type

None

static AnnDataManager.view_setup_method_args(registry)[source]#

Prints setup kwargs used to produce a given registry.

Parameters

registry (dict) – Registry produced by an AnnDataManager.

Return type

None