scvi.hub.HubModel

class scvi.hub.HubModel(local_dir, metadata=None, model_card=None)

Provides functionality to interact with the scvi-hub backed by huggingface.

Parameters:
  • local_dir (str) – Local directory where the data and pre-trained model reside.

  • metadata (Union[HubMetadata, str, None] (default: None)) – Either an instance of HubMetadata that contains the required metadata for this model, or a path to a file on disk where this metadata can be read from.

  • model_card (Union[HubModelCardHelper, ModelCard, str, None] (default: None)) – The model card for this pre-trained model. The model card is a markdown file that describes the pre-trained model and its data and is displayed on huggingface. This can be an instance of ModelCard, an instance of HubModelCardHelper that wraps the model card, or a path to a file on disk from which the model card can be read.

Notes

See further usage examples in the following tutorials:

  1. Using scvi-hub to download pretrained scvi-tools models

  2. Using scvi-hub to upload pretrained scvi-tools models

Attributes table

adata

Returns the data for this model.

large_training_adata

Returns the training data for this model, which might be too large to reside within the hub model.

metadata

The metadata for this model.

model

Returns the model object for this hub model.

model_card

The model card for this model.

Methods table

load_model([adata])

Loads the model.

pull_from_huggingface_hub(repo_name[, ...])

Download the given model repo from huggingface.

push_to_huggingface_hub(repo_name, ...)

Push this model to huggingface.

read_adata()

Reads the data from disk (self._adata_path) if it exists.

read_large_training_adata()

Downloads the large training adata, if it exists, then loads it into memory.

Attributes

adata

HubModel.adata

Returns the data for this model.

If the data has not been loaded yet, this will call read_adata(). Otherwise, it will simply return the loaded data.
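The load-on-first-access behavior described here (and for model and large_training_adata) can be illustrated with a small stand-in class. This is only a sketch of the general pattern, not the scvi-tools implementation; LazyHolder, its loader argument, and the read_calls counter are hypothetical names introduced for illustration.

```python
class LazyHolder:
    """Stand-in for the load-on-first-access pattern (not scvi-tools code)."""

    def __init__(self, loader):
        self._loader = loader  # hypothetical callable that reads data from disk
        self._adata = None     # nothing has been loaded yet
        self.read_calls = 0    # counts how often the loader actually ran

    def read_adata(self):
        # Run the loader and cache its result on the instance.
        self.read_calls += 1
        self._adata = self._loader()

    @property
    def adata(self):
        # Load only on first access; later accesses reuse the cached object.
        if self._adata is None:
            self.read_adata()
        return self._adata


holder = LazyHolder(lambda: {"n_obs": 3})
first = holder.adata   # triggers read_adata()
second = holder.adata  # returns the cached object
```

After both accesses, the loader has run once and both names refer to the same object.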

large_training_adata

HubModel.large_training_adata

Returns the training data for this model, which might be too large to reside within the hub model.

If the data has not been loaded yet, this will call read_large_training_adata(), which will attempt to download from the source url. Otherwise, it will simply return the loaded data.

metadata

HubModel.metadata

The metadata for this model.

model

HubModel.model

Returns the model object for this hub model.

If the model has not been loaded yet, this will call load_model(). Otherwise, it will simply return the loaded model.

model_card

HubModel.model_card

The model card for this model.

Methods

load_model

HubModel.load_model(adata=None)

Loads the model.

Parameters:

adata (Optional[AnnData] (default: None)) – The data to load the model with. If None, the model is loaded using the data at self._adata_path. If that file does not exist, the model is loaded using the large_training_adata attribute. If that does not exist either, an error is raised.
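The fallback order above can be sketched as a small decision function. This is an illustration of the documented order only, not the library's code: pick_adata_source, its arguments, and the choice of ValueError are assumptions (the documentation does not state which exception type is raised).

```python
from pathlib import Path


def pick_adata_source(adata=None, adata_path=None, has_large_training_adata=False):
    """Return a label for the data source that would be used (hypothetical helper)."""
    if adata is not None:
        # 1. Data passed explicitly by the caller takes priority.
        return "explicit"
    if adata_path is not None and Path(adata_path).is_file():
        # 2. Otherwise use the data file stored next to the model, if present.
        return "local_file"
    if has_large_training_adata:
        # 3. Otherwise fall back to the large training data.
        return "large_training_adata"
    # 4. Nothing available. The exception type here is an assumption.
    raise ValueError("No data available to load the model with.")
```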

pull_from_huggingface_hub

classmethod HubModel.pull_from_huggingface_hub(repo_name, cache_dir=None, revision=None, **kwargs)

Download the given model repo from huggingface.

The model, its card, data, and metadata are downloaded to a cached location on disk selected by huggingface. An instance of this class is then created from those files and returned.

Parameters:
  • repo_name (str) – ID of the huggingface repo from which the model is downloaded

  • cache_dir (Optional[str] (default: None)) – The directory where the downloaded model artifacts will be cached

  • revision (Optional[str] (default: None)) – The revision to pull from the repo. This can be a branch name, a tag, or a full-length commit hash. If None, the default (latest) revision is pulled.

  • kwargs – Additional keyword arguments to pass to snapshot_download().
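A minimal usage sketch based on the signature above. The wrapper name pull_hub_model and the repo ID in the commented call are placeholders, not real identifiers; running the call itself requires scvi-tools to be installed, network access, and an existing repo.

```python
def pull_hub_model(repo_name, revision=None, cache_dir=None):
    """Download a hub model repo and return the resulting HubModel instance."""
    # Imported lazily so that defining this helper does not require scvi-tools.
    from scvi.hub import HubModel

    return HubModel.pull_from_huggingface_hub(
        repo_name,
        cache_dir=cache_dir,  # None lets huggingface choose the cache location
        revision=revision,    # branch name, tag, or full-length commit hash
    )


# Example call (the repo ID below is a placeholder):
# hub_model = pull_hub_model("some-org/some-scvi-model", revision="main")
# model = hub_model.model  # calls load_model() on first access
# adata = hub_model.adata  # calls read_adata() on first access
```

Pinning revision to a tag or commit hash keeps repeated downloads reproducible, whereas None follows the default (latest) revision.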

push_to_huggingface_hub

HubModel.push_to_huggingface_hub(repo_name, repo_token, repo_create)

Push this model to huggingface.

If the dataset is too large to upload to huggingface, this will raise an exception prompting the user to upload the data elsewhere. Otherwise, the data, model card, and metadata are all uploaded to the given model repo.

Parameters:
  • repo_name (str) – ID of the huggingface repo where this model needs to be uploaded

  • repo_token (str) – huggingface API token with write permissions

  • repo_create (bool) – Whether to create the repo
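A minimal upload sketch using the constructor and method signatures documented on this page. The helper names, the metadata and model card file paths, and the HF_TOKEN environment variable name are assumptions made for illustration; running the upload requires scvi-tools and a token with write permissions.

```python
import os


def read_token(env=None, var="HF_TOKEN"):
    """Read the API token from the environment (the variable name is an assumption)."""
    env = os.environ if env is None else env
    if var not in env:
        raise KeyError(f"Set {var} to a huggingface token with write permissions.")
    return env[var]


def push_hub_model(local_dir, repo_name, metadata_path, model_card_path, create_repo=False):
    """Build a HubModel from files on disk and push it to the given repo."""
    # Imported lazily so that defining this helper does not require scvi-tools.
    from scvi.hub import HubModel

    hub_model = HubModel(
        local_dir,
        metadata=metadata_path,      # path to a metadata file on disk
        model_card=model_card_path,  # path to a model card file on disk
    )
    hub_model.push_to_huggingface_hub(
        repo_name,
        repo_token=read_token(),
        repo_create=create_repo,
    )
    return hub_model
```

Reading the token from the environment rather than hard-coding it keeps credentials out of scripts and notebooks.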

read_adata

HubModel.read_adata()

Reads the data from disk (self._adata_path) if it exists. Otherwise, this is a no-op.

read_large_training_adata

HubModel.read_large_training_adata()

Downloads the large training adata, if it exists, then loads it into memory. Otherwise, this is a no-op.

Notes

The large training data url can be a cellxgene explorer session url. However it cannot be a self-hosted session. In other words, it must be from the cellxgene portal (https://cellxgene.cziscience.com/).
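The portal restriction can be checked before setting a training data url. The sketch below only compares the host name and is not the validation scvi-tools performs; is_cellxgene_portal_url and the example paths are hypothetical.

```python
from urllib.parse import urlparse

# Host of the cellxgene portal named in the note above.
CELLXGENE_PORTAL_HOST = "cellxgene.cziscience.com"


def is_cellxgene_portal_url(url):
    """Return True if the url points at the cellxgene portal host (hypothetical helper)."""
    return urlparse(url).hostname == CELLXGENE_PORTAL_HOST


# Placeholder session paths, used only to show the host comparison.
portal_ok = is_cellxgene_portal_url("https://cellxgene.cziscience.com/e/example.cxg/")
self_hosted_ok = is_cellxgene_portal_url("https://cellxgene.my-lab.example.org/e/example.cxg/")
```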