scvi.data.fields.ProteinObsmField

class scvi.data.fields.ProteinObsmField(*base_field_args, use_batch_mask=True, batch_field=None, **base_field_kwargs)

An AnnDataField for protein data stored in an .obsm field of an AnnData object.

Intended for use with the TotalVI model. In addition to registering the data, it computes a mask that indicates which batches are missing protein data.
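As a rough illustration of what such a mask captures, the following pure-Python sketch marks, for each batch, which proteins have at least one nonzero count among that batch's cells. The function name `protein_batch_mask` and the list-based inputs are assumptions made for this illustration; the field itself operates on AnnData objects, and its actual implementation may differ.

```python
from collections import defaultdict


def protein_batch_mask(protein_counts, batches):
    """Return {batch: [bool per protein]}, True where the batch has data.

    Hypothetical helper for illustration only; not part of scvi-tools.
    A protein is treated as missing in a batch when its counts sum to
    zero over all cells of that batch.
    """
    n_proteins = len(protein_counts[0]) if protein_counts else 0
    sums = defaultdict(lambda: [0] * n_proteins)
    for row, batch in zip(protein_counts, batches):
        totals = sums[batch]
        for j, value in enumerate(row):
            totals[j] += value
    return {batch: [total > 0 for total in totals] for batch, totals in sums.items()}


# Two batches, three proteins; batch "b" never measured the third protein.
counts = [
    [5, 0, 2],  # cell 1, batch "a"
    [1, 3, 0],  # cell 2, batch "a"
    [4, 2, 0],  # cell 3, batch "b"
    [0, 7, 0],  # cell 4, batch "b"
]
mask = protein_batch_mask(counts, ["a", "a", "b", "b"])
print(mask)  # {'a': [True, True, True], 'b': [True, True, False]}
```

A model can use a mask of this kind to ignore proteins that were never measured in a given batch rather than treating their zeros as real observations.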

Parameters:
registry_key

Key to register field under in data registry.

obsm_key

Key to access the field in the AnnData .obsm mapping.

use_batch_mask : bool (default: True)

If True, computes a batch mask over the data that flags missing protein data. Requires batch_key to be set (not None).

batch_key

Key corresponding to the .obs field where batch indices are stored. Used for computing a batch mask over the data for missing protein data.

colnames_uns_key

Key to access column names corresponding to each column of the .obsm field in the AnnData .uns mapping. Only used when .obsm data is a np.ndarray, not a pd.DataFrame.

is_count_data

If True, checks if the data are counts during validation.

correct_data_format

If True, checks that the AnnData field is C_CONTIGUOUS (for a dense NumPy array) or in CSR format (for a sparse matrix), and corrects the format if it is not.
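The is_count_data check can be pictured as a test that every entry is a non-negative whole number. The sketch below is a simplified stand-in written with plain Python lists; `looks_like_counts` is a hypothetical name, and the library's own validation may be implemented differently.

```python
def looks_like_counts(matrix):
    """Return True if every entry is a non-negative whole number.

    Hypothetical helper for illustration only; not part of scvi-tools.
    """
    return all(
        value >= 0 and float(value).is_integer()
        for row in matrix
        for value in row
    )


raw = [[0, 3, 12], [7, 0, 1]]                    # raw protein counts
normalized = [[0.0, 1.7, 2.5], [0.3, 0.0, 1.1]]  # e.g. transformed values

print(looks_like_counts(raw))         # True
print(looks_like_counts(normalized))  # False
```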

Attributes table

COLUMN_NAMES_KEY

PROTEIN_BATCH_MASK

attr_key

The key of the data field within the relevant AnnData attribute.

attr_name

The name of the AnnData attribute where the data is stored.

is_empty

Returns True if the field is empty as a function of its kwargs.

mod_key

The modality key of the data field within the MuData (if applicable).

registry_key

The key that is referenced by models via a data loader.

Methods table

get_data_registry()

Returns a nested dictionary which describes the mapping to the data field.

get_field_data(adata)

Returns the requested data as determined by the field for a given AnnData/MuData object.

get_summary_stats(state_registry)

Returns a dictionary of summary statistics relevant to the field.

register_field(adata)

Sets up the AnnData/MuData object and creates a mapping for scvi-tools models to use.

transfer_field(state_registry, adata_target, ...)

Takes an existing scvi-tools setup dictionary and transfers the same setup to the target AnnData.

validate_field(adata)

Validates whether an AnnData/MuData object is compatible with this field definition.

view_state_registry(state_registry)

Returns a rich.table.Table summarizing a state registry produced by this field.

Attributes

COLUMN_NAMES_KEY

ProteinObsmField.COLUMN_NAMES_KEY = 'column_names'

PROTEIN_BATCH_MASK

ProteinObsmField.PROTEIN_BATCH_MASK = 'protein_batch_mask'

attr_key

ProteinObsmField.attr_key
Return type:

str

attr_name

ProteinObsmField.attr_name
Return type:

str

is_empty

ProteinObsmField.is_empty
Return type:

bool

mod_key

ProteinObsmField.mod_key

The modality key of the data field within the MuData (if applicable).

Return type:

str | None

registry_key

ProteinObsmField.registry_key
Return type:

str

Methods

get_data_registry

ProteinObsmField.get_data_registry()

Returns a nested dictionary which describes the mapping to the data field.

The dictionary is of the form {"mod_key": mod_key, "attr_name": attr_name, "attr_key": attr_key}. This mapping is then combined with the mappings of other fields to make up the data registry.
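To make the shape of this mapping concrete, here is a minimal sketch that builds one such entry and combines it with a second field's entry. The helper `make_registry_entry`, the registry keys, and the attribute values are placeholders chosen for this example, and keying the combined mapping by registry key is an assumption rather than something stated above.

```python
def make_registry_entry(mod_key, attr_name, attr_key):
    """Build one field's mapping in the form described above.

    Hypothetical helper for illustration only; not part of scvi-tools.
    """
    return {"mod_key": mod_key, "attr_name": attr_name, "attr_key": attr_key}


# Placeholder values: a protein field stored in .obsm and a batch field in .obs.
protein_entry = make_registry_entry(None, "obsm", "protein_expression")
batch_entry = make_registry_entry(None, "obs", "batch")

# Assumption: each field contributes one entry, keyed by its registry key.
data_registry = {
    "proteins": protein_entry,
    "batch": batch_entry,
}

print(data_registry["proteins"]["attr_name"])  # obsm
```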

Return type:

dict

get_field_data

ProteinObsmField.get_field_data(adata)

Returns the requested data as determined by the field for a given AnnData/MuData object.

Return type:

ndarray | DataFrame

get_summary_stats

ProteinObsmField.get_summary_stats(state_registry)

Returns a dictionary of summary statistics relevant to the field.

Parameters:
state_registry : dict

Dictionary returned by register_field(). Summary stats should always be a function of information stored in this dictionary.

Return type:

dict

Returns:

summary_stats_dict: A dictionary of the form {summary_stat_name: summary_stat_value}. This mapping is then combined with the mappings of other fields to make up the summary stats mapping.
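A small sketch of how per-field summary statistics of this form could be merged into one mapping. The helper `merge_summary_stats` and the stat names and values are invented for the example; each field decides which statistics it reports.

```python
def merge_summary_stats(*per_field_stats):
    """Combine {stat_name: stat_value} dicts from several fields into one.

    Hypothetical helper for illustration only; not part of scvi-tools.
    Raises ValueError if two fields report the same statistic name.
    """
    merged = {}
    for stats in per_field_stats:
        overlap = merged.keys() & stats.keys()
        if overlap:
            raise ValueError(f"duplicate summary stat names: {sorted(overlap)}")
        merged.update(stats)
    return merged


protein_stats = {"n_proteins": 14}  # invented value
batch_stats = {"n_batch": 2}        # invented value

summary = merge_summary_stats(protein_stats, batch_stats)
print(summary)  # {'n_proteins': 14, 'n_batch': 2}
```

Rejecting duplicate names is a design choice of this sketch: it keeps one field from silently overwriting another field's statistic.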

register_field

ProteinObsmField.register_field(adata)

Sets up the AnnData/MuData object and creates a mapping for scvi-tools models to use.

Return type:

dict

Returns:

dict: A dictionary containing any additional state required for scvi-tools models not stored directly on the AnnData/MuData object.

transfer_field

ProteinObsmField.transfer_field(state_registry, adata_target, **kwargs)

Takes an existing scvi-tools setup dictionary and transfers the same setup to the target AnnData.

Used when running a pretrained model on a new AnnData object, which requires the mapping from the original data to be applied to the new object.

Parameters:
state_registry : dict

state_registry dictionary created after registering an AnnData using an AnnDataManager object.

adata_target : AnnData

AnnData/MuData object that is being registered.

**kwargs

Keyword arguments to modify transfer behavior.

Return type:

dict

Returns:

dict: A dictionary containing any additional state required for scvi-tools models not stored directly on the AnnData object.

validate_field

ProteinObsmField.validate_field(adata)

Validates whether an AnnData/MuData object is compatible with this field definition.

Return type:

None

view_state_registry

ProteinObsmField.view_state_registry(state_registry)

Returns a rich.table.Table summarizing a state registry produced by this field.

Parameters:
state_registry : dict

Dictionary returned by register_field(). Printed summary should always be a function of information stored in this dictionary.

Return type:

Table | None

Returns:

state_registry_summary: Optional rich.table.Table summarizing the state_registry.