scvi.hub.HubModel#
- class scvi.hub.HubModel(local_dir, metadata=None, model_card=None)[source]#
Wrapper for BaseModelClass backed by HuggingFace Hub.
- Parameters:
local_dir (str) – Local directory where the data and pre-trained model reside.
metadata (HubMetadata | str | None (default: None)) – Either an instance of HubMetadata that contains the required metadata for this model, or a path to a file on disk where this metadata can be read from.
model_card (HubModelCardHelper | ModelCard | str | None (default: None)) – The model card for this pre-trained model. The model card is a markdown file that describes the pre-trained model/data and is displayed on HuggingFace. This can be an instance of ModelCard, an instance of HubModelCardHelper that wraps the model card, or a path to a file on disk where the model card can be read from.
Notes
See further usage examples in the following tutorials:
Attributes table#
| adata | Returns the data for this model. |
| large_training_adata | Returns the training data for this model. |
| local_dir | The local directory where the data and pre-trained model reside. |
| metadata | The metadata for this model. |
| model | Returns the model object for this hub model. |
| model_card | The model card for this model. |
Methods table#
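| load_model() | Loads the model. |
| pull_from_huggingface_hub() | Download the given model repo from huggingface. |
| pull_from_s3() | Download a HubModel from an S3 bucket. |
| push_to_huggingface_hub() | Push this model to huggingface. |
| push_to_s3() | Upload the HubModel to an S3 bucket. |
| read_adata() | Reads the data from disk (self._adata_path). |
| read_large_training_adata() | Downloads the large training adata. |
| save() | Save the model card and metadata to the model directory. |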
Attributes#
- HubModel.adata[source]#
Returns the data for this model.
If the data has not been loaded yet, this will call read_adata(). Otherwise, it will simply return the loaded data.
- HubModel.large_training_adata[source]#
Returns the training data for this model.
If the data has not been loaded yet, this will call read_large_training_adata(), which will attempt to download from the source url. Otherwise, it will simply return the loaded data.
- HubModel.model[source]#
Returns the model object for this hub model.
If the model has not been loaded yet, this will call load_model(). Otherwise, it will simply return the loaded model.
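The three attributes above share a load-on-first-access pattern. The following is a minimal sketch of that pattern only, not the scvi-tools implementation; the class `LazyHub` and its `loader` argument are hypothetical names introduced for illustration.

```python
class LazyHub:
    """Toy stand-in for the load-on-first-access pattern (not scvi-tools code)."""

    def __init__(self, loader):
        self._loader = loader  # hypothetical callable that produces the object
        self._model = None     # nothing has been loaded yet

    @property
    def model(self):
        # Load on first access, then keep returning the cached object.
        if self._model is None:
            self._model = self._loader()
        return self._model


calls = []
hub = LazyHub(lambda: calls.append("load") or {"name": "toy-model"})
first = hub.model   # triggers the loader
second = hub.model  # served from the cache, loader is not called again
```

The practical consequence is that the first access to `adata`, `large_training_adata`, or `model` may be slow (disk read or download), while later accesses are cheap.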
Methods#
- HubModel.load_model(adata=None, accelerator='auto', device='auto')[source]#
Loads the model.
- Parameters:
adata (AnnData | None (default: None)) – The data to load the model with, if not None. If None, we’ll try to load the model using the data at self._adata_path. If that file does not exist, we’ll try to load the model using large_training_adata(). If that does not exist either, we’ll error out.
%(param_accelerator)s
%(param_device)s
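The fallback order described for `adata` can be sketched as a small resolver. This is an illustration of the documented order under stated assumptions, not the library's code: `resolve_adata`, its arguments, and the returned labels are hypothetical.

```python
import os


def resolve_adata(adata=None, adata_path=None, load_large_training_adata=None):
    """Return (source_label, data) following the documented fallback order."""
    # 1. Data passed explicitly by the caller wins.
    if adata is not None:
        return "argument", adata
    # 2. Otherwise use the file at adata_path, if it exists on disk.
    if adata_path is not None and os.path.isfile(adata_path):
        return "adata_path", adata_path  # a real caller would read the file here
    # 3. Otherwise fall back to the large training data, if it can be obtained.
    if load_large_training_adata is not None:
        large = load_large_training_adata()
        if large is not None:
            return "large_training_adata", large
    # 4. Nothing available: error out.
    raise ValueError("No data available to load the model with.")
```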
- classmethod HubModel.pull_from_huggingface_hub(repo_name, cache_dir=None, revision=None, pull_anndata=True, **kwargs)[source]#
Download the given model repo from huggingface.
The model, its card, data, metadata are downloaded to a cached location on disk selected by huggingface and an instance of this class is created with that info and returned.
- Parameters:
repo_name (str) – ID of the huggingface repo from which this model is downloaded.
cache_dir (str | None (default: None)) – The directory where the downloaded model artifacts will be cached.
revision (str | None (default: None)) – The revision to pull from the repo. This can be a branch name, a tag, or a full-length commit hash. If None, the default (latest) revision is pulled.
pull_anndata (bool (default: True)) – Whether to pull the AnnData object associated with the model. If True but the file does not exist, this will fail silently.
kwargs – Additional keyword arguments to pass to snapshot_download().
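A minimal usage sketch for pulling a model, assuming scvi-tools is installed. The wrapper `pull_hub_model` and its `hub_model_cls` argument are hypothetical conveniences added here so the call can be exercised without network access; only `pull_from_huggingface_hub` and its parameters come from the reference above.

```python
def pull_hub_model(repo_name, revision=None, cache_dir=None,
                   pull_anndata=True, hub_model_cls=None):
    """Pull a model repo and return the resulting hub model instance."""
    if hub_model_cls is None:
        # Imported lazily so this sketch can be defined without scvi-tools.
        from scvi.hub import HubModel
        hub_model_cls = HubModel
    return hub_model_cls.pull_from_huggingface_hub(
        repo_name,
        cache_dir=cache_dir,
        revision=revision,
        pull_anndata=pull_anndata,
    )


# Example call (the repo ID below is a placeholder, not a real repository):
# hub_model = pull_hub_model("some-org/some-model", revision="main")
# model = hub_model.model  # loads the model on first access
```

Pinning `revision` to a tag or full commit hash rather than a branch name keeps repeated pulls reproducible.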
- classmethod HubModel.pull_from_s3(cls, s3_bucket, s3_path, pull_anndata=True, cache_dir=None, unsigned=False, **kwargs)[source]#
Download a HubModel from an S3 bucket.
Requires boto3 to be installed.
- Parameters:
s3_bucket (str) – The S3 bucket from which to download the model.
s3_path (str) – The S3 path to the saved model.
pull_anndata (bool (default: True)) – Whether to pull the AnnData object associated with the model.
cache_dir (str | None (default: None)) – The directory where the downloaded model files will be cached. Defaults to a temporary directory created with tempfile.mkdtemp().
unsigned (bool (default: False)) – Whether to use unsigned requests. If True and config is passed in kwargs, config will be overwritten.
**kwargs – Keyword arguments passed into client().
- Return type:
HubModel
- Returns:
The pretrained model specified by the given S3 bucket and path.
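Because `unsigned=True` overwrites any `config` supplied in `kwargs`, it can help to surface that conflict before calling `pull_from_s3`. The helper below is a hypothetical pre-check written for illustration; it is not part of scvi-tools and does not touch boto3.

```python
import warnings


def checked_s3_kwargs(unsigned=False, **client_kwargs):
    """Return keyword arguments for pull_from_s3, warning on a config conflict."""
    if unsigned and "config" in client_kwargs:
        # Documented behavior: config is overwritten when unsigned=True.
        warnings.warn(
            "unsigned=True: the `config` passed in kwargs will be overwritten.",
            stacklevel=2,
        )
    return {"unsigned": unsigned, **client_kwargs}


# Example call (bucket and path are placeholders):
# HubModel.pull_from_s3("my-bucket", "models/my-model", **checked_s3_kwargs(unsigned=True))
```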
- HubModel.push_to_huggingface_hub(repo_name, repo_token, repo_create=False, push_anndata=True, repo_create_kwargs=None, **kwargs)[source]#
Push this model to huggingface.
If the dataset is too large to upload to huggingface, this will raise an exception prompting the user to upload the data elsewhere. Otherwise, the data, model card, and metadata are all uploaded to the given model repo.
- Parameters:
repo_name (str) – ID of the huggingface repo where this model needs to be uploaded.
repo_token (str) – huggingface API token with write permissions.
repo_create (bool (default: False)) – Whether to create the repo.
push_anndata (bool (default: True)) – Whether to push the AnnData object associated with the model.
repo_create_kwargs (dict | None (default: None)) – Keyword arguments passed into create_repo() if repo_create=True.
**kwargs – Additional keyword arguments passed into upload_file().
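A sketch of pushing a model while keeping the token out of source code. The helper `push_hub_model` and the environment variable name `HF_TOKEN` are assumptions made for this example; the only library call used is `push_to_huggingface_hub` with the parameters listed above.

```python
import os


def push_hub_model(hub_model, repo_name, token_env="HF_TOKEN",
                   repo_create=False, push_anndata=True):
    """Push hub_model to repo_name using a token read from the environment."""
    repo_token = os.environ.get(token_env)
    if not repo_token:
        raise RuntimeError(
            f"Set {token_env} to a huggingface API token with write permissions."
        )
    hub_model.push_to_huggingface_hub(
        repo_name,
        repo_token,
        repo_create=repo_create,
        push_anndata=push_anndata,
    )


# Example call (the repo ID is a placeholder):
# push_hub_model(hub_model, "some-org/some-model", repo_create=True)
```

If the dataset is too large for huggingface, the push raises as described above; in that case `push_anndata=False` together with externally hosted data is one option to consider.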
- HubModel.push_to_s3(s3_bucket, s3_path, push_anndata=True, **kwargs)[source]#
Upload the HubModel to an S3 bucket.
Requires boto3 to be installed.
- HubModel.read_adata()[source]#
Reads the data from disk (self._adata_path).
The file is read if it exists. Otherwise, this is a no-op.
- Return type:
None
- HubModel.read_large_training_adata()[source]#
Downloads the large training adata.
If it exists, then load it into memory. Otherwise, this is a no-op.
- Return type:
None
Notes
The large training data url can be a cellxgene explorer session url. However, it cannot be a self-hosted session: it must come from the cellxgene portal (https://cellxgene.cziscience.com/).
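The portal restriction can be checked up front with a simple hostname comparison. This is a hypothetical pre-check, not the validation scvi-tools performs; requiring the `https` scheme is an assumption based on the portal URL given above.

```python
from urllib.parse import urlparse

CELLXGENE_PORTAL_HOST = "cellxgene.cziscience.com"


def is_portal_url(url):
    """Return True if url points at the cellxgene portal host over https."""
    parsed = urlparse(url)
    # hostname is lowercased by urlparse, so the comparison is case-insensitive.
    return parsed.scheme == "https" and parsed.hostname == CELLXGENE_PORTAL_HOST


# The paths below are placeholders, not real explorer sessions.
portal_ok = is_portal_url("https://cellxgene.cziscience.com/e/example.cxg/")
self_hosted_ok = is_portal_url("https://cellxgene.example.org/e/example.cxg/")
```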