scvi.data.fields.ProteinObsmField

class scvi.data.fields.ProteinObsmField(registry_key, obsm_key, use_batch_mask=True, batch_key=None, colnames_uns_key=None, is_count_data=False, correct_data_format=True)

An AnnDataField for protein data stored in an .obsm field of an AnnData object.

For usage with the TotalVI model. Computes an additional mask which indicates where batches are missing protein data.
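To make the idea of a batch mask concrete, here is a minimal NumPy sketch. It assumes that a protein counts as missing in a batch when every cell of that batch has a zero value for it; the page does not state the exact criterion, and the function name `sketch_batch_mask` is hypothetical, not part of scvi-tools.

```python
import numpy as np


def sketch_batch_mask(protein: np.ndarray, batches: np.ndarray) -> dict:
    """Map each batch label to a boolean vector over proteins.

    ASSUMPTION: a protein is treated as missing in a batch when every
    cell of that batch has a zero value for it.
    """
    mask = {}
    for batch in np.unique(batches):
        rows = protein[batches == batch]
        # True where at least one cell in the batch has a nonzero value.
        mask[str(batch)] = ~np.all(rows == 0, axis=0)
    return mask


# Toy data: 4 cells x 3 proteins; batch "b" has no signal for the last protein.
protein = np.array([[5, 0, 2], [1, 3, 0], [4, 1, 0], [0, 2, 0]])
batches = np.array(["a", "a", "b", "b"])
batch_mask = sketch_batch_mask(protein, batches)
```

In this toy case the mask for batch "b" marks the last protein as absent, while batch "a" has all three proteins present.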

Parameters
registry_key : str

Key to register field under in data registry.

obsm_key : str

Key to access the field in the AnnData .obsm mapping.

use_batch_mask : bool (default: True)

If True, computes a batch mask over the data for missing protein data. Requires batch_key to be set (not None).

batch_key : str | None (default: None)

Key corresponding to the .obs field where batch indices are stored. Used for computing a batch mask over the data for missing protein data.

colnames_uns_key : str | None (default: None)

Key to access column names corresponding to each column of the .obsm field in the AnnData .uns mapping. Only used when .obsm data is a np.ndarray, not a pd.DataFrame.

is_count_data : bool (default: False)

If True, checks if the data are counts during validation.

correct_data_format : bool (default: True)

If True, checks that the AnnData field is C_CONTIGUOUS (if it is a dense numpy array) or csr (if it is sparse), and corrects the format if it is not.
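The checks behind is_count_data and correct_data_format can be pictured with plain NumPy. The sketch below covers only the dense case and assumes that "counts" means non-negative integer values; both helper names are hypothetical and are not the functions scvi-tools uses internally.

```python
import numpy as np


def looks_like_counts(x: np.ndarray) -> bool:
    """Hypothetical check: every value is a non-negative integer."""
    return bool(np.all(x >= 0) and np.all(np.mod(x, 1) == 0))


def ensure_c_contiguous(x: np.ndarray) -> np.ndarray:
    """Return x unchanged if it is already C-contiguous, else a C-ordered copy."""
    if x.flags["C_CONTIGUOUS"]:
        return x
    return np.ascontiguousarray(x)


# A Fortran-ordered array stands in for data in the "wrong" memory layout.
counts = np.asfortranarray(np.array([[3.0, 0.0], [1.0, 7.0]]))
fixed = ensure_c_contiguous(counts)
```

The values are unchanged by the layout correction; only the memory order differs.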

Attributes table

COLUMN_NAMES_KEY

PROTEIN_BATCH_MASK

attr_key

The key of the data field within the relevant AnnData attribute.

attr_name

The name of the AnnData attribute where the data is stored.

is_empty

Returns True if the field is empty as a function of its kwargs.

registry_key

The key that is referenced by models via a data loader.

Methods table

get_data_registry()

Returns a nested dictionary which describes the mapping to the AnnData data field.

get_field_data(adata)

Returns the requested data as determined by the field for a given AnnData object.

get_summary_stats(state_registry)

Returns a dictionary of summary statistics relevant to the field.

register_field(adata)

Sets up the AnnData object and creates a mapping for scvi-tools models to use.

transfer_field(state_registry, adata_target, ...)

Takes an existing scvi-tools setup dictionary and transfers the same setup to the target AnnData.

validate_field(adata)

Validates whether an AnnData object is compatible with this field definition.

view_state_registry(state_registry)

Returns a rich.table.Table summarizing a state registry produced by this field.

Attributes

COLUMN_NAMES_KEY

ProteinObsmField.COLUMN_NAMES_KEY = 'column_names'

PROTEIN_BATCH_MASK

ProteinObsmField.PROTEIN_BATCH_MASK = 'protein_batch_mask'

attr_key

ProteinObsmField.attr_key
Return type

str

attr_name

ProteinObsmField.attr_name
Return type

str

is_empty

ProteinObsmField.is_empty
Return type

bool

registry_key

ProteinObsmField.registry_key
Return type

str

Methods

get_data_registry

ProteinObsmField.get_data_registry()

Returns a nested dictionary which describes the mapping to the AnnData data field.

The dictionary is of the form {"attr_name": attr_name, "attr_key": attr_key}. This mapping is then combined with the mappings of other fields to make up the data registry.

Return type

dict
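As an illustration of the mapping described for get_data_registry(), the following sketch builds two per-field dictionaries of that form and combines them into a registry. The registry keys, the .obsm and .obs key names, and the `fetch` helper are all hypothetical, and a SimpleNamespace stands in for a real AnnData object.

```python
from types import SimpleNamespace

# Per-field mappings of the form {"attr_name": ..., "attr_key": ...}.
# The key names below are illustrative only.
protein_mapping = {"attr_name": "obsm", "attr_key": "protein_expression"}
batch_mapping = {"attr_name": "obs", "attr_key": "batch"}

# Combined registry: one entry per (hypothetical) registry key.
data_registry = {"proteins": protein_mapping, "batch": batch_mapping}

# Stand-in for an AnnData object, exposing only the attributes used here.
fake_adata = SimpleNamespace(
    obsm={"protein_expression": [[5, 0], [1, 3]]},
    obs={"batch": ["a", "b"]},
)


def fetch(adata_like, mapping):
    """Resolve a mapping to the data it points at on an AnnData-like object."""
    return getattr(adata_like, mapping["attr_name"])[mapping["attr_key"]]


proteins = fetch(fake_adata, data_registry["proteins"])
```

The point of the two-level mapping is that a consumer needs only the registry key to locate the data, without knowing which AnnData attribute holds it.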

get_field_data

ProteinObsmField.get_field_data(adata)

Returns the requested data as determined by the field for a given AnnData object.

Return type

ndarray | DataFrame

get_summary_stats

ProteinObsmField.get_summary_stats(state_registry)

Returns a dictionary of summary statistics relevant to the field.

Parameters
state_registry : dict

Dictionary returned by register_field(). Summary stats should always be a function of information stored in this dictionary.

Return type

dict

Returns

summary_stats_dict: A dictionary of the form {summary_stat_name: summary_stat_value}. This mapping is then combined with the mappings of other fields to make up the summary stats mapping.
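The way per-field summary stats combine into one mapping can be sketched with ordinary dictionaries. The stat names, the values, and the `combine_summary_stats` helper are hypothetical; the sketch also assumes that stat names are unique across fields, which the page does not state.

```python
def combine_summary_stats(per_field_stats):
    """Merge {summary_stat_name: summary_stat_value} dicts from several fields.

    ASSUMPTION: stat names are unique across fields, so a clash is an error.
    """
    combined = {}
    for stats in per_field_stats:
        overlap = combined.keys() & stats.keys()
        if overlap:
            raise ValueError(f"Duplicate summary stat names: {sorted(overlap)}")
        combined.update(stats)
    return combined


# Illustrative per-field outputs; names and values are made up.
protein_stats = {"n_proteins": 14}
batch_stats = {"n_batch": 2}
summary = combine_summary_stats([protein_stats, batch_stats])
```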

register_field

ProteinObsmField.register_field(adata)

Sets up the AnnData object and creates a mapping for scvi-tools models to use.

Return type

dict

Returns

dict A dictionary containing any additional state required for scvi-tools models not stored directly on the AnnData object.

transfer_field

ProteinObsmField.transfer_field(state_registry, adata_target, **kwargs)

Takes an existing scvi-tools setup dictionary and transfers the same setup to the target AnnData.

Used when one is running a pretrained model on a new AnnData object, which requires the mapping from the original data to be applied to the new AnnData object.

Parameters
state_registry : dict

state_registry dictionary created after registering an AnnData using an AnnDataManager object.

adata_target : AnnData

AnnData object that is being registered.

**kwargs

Keyword arguments to modify transfer behavior.

Return type

dict

Returns

dict A dictionary containing any additional state required for scvi-tools models not stored directly on the AnnData object.

validate_field

ProteinObsmField.validate_field(adata)

Validates whether an AnnData object is compatible with this field definition.

Return type

None

view_state_registry

ProteinObsmField.view_state_registry(state_registry)

Returns a rich.table.Table summarizing a state registry produced by this field.

Parameters
state_registry : dict

Dictionary returned by register_field(). Printed summary should always be a function of information stored in this dictionary.

Return type

Table | None

Returns

state_registry_summary Optional rich.table.Table summarizing the state_registry.