scvi.model.base.SemisupervisedTrainingMixin

class scvi.model.base.SemisupervisedTrainingMixin: General-purpose semisupervised train, predict, and interoperability methods.

Methods table

`get_ranked_features`([adata, attrs])	Get the ranked gene list based on highest attributions.
`predict`([adata, indices, soft, batch_size, ...])	Return cell label predictions.
`shap_adata_predict`(X)	SHAP operator: returns soft predictions for the given data X.
`shap_predict`([adata, indices, shap_args])	Run the SHAP interpreter for a trained model and return the SHAP values.
`train`([max_epochs, n_samples_per_label, ...])	Train the model.

Methods

SemisupervisedTrainingMixin.get_ranked_features(adata=None, attrs=None)

Get the ranked gene list based on highest attributions.

Parameters:

adata (AnnData | MuData | None (default: None)) – AnnData or MuData object that has been registered via the corresponding setup method in the model class.
attrs (numpy.ndarray) – Attributions matrix.

Return type:

DataFrame

Returns:

A pandas DataFrame containing the ranked attributions for each gene.

Examples

>>> attrs_df = model.get_ranked_features(attrs=attrs)
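
To make "ranked by highest attributions" concrete, here is a minimal, dependency-free sketch of one plausible ranking: score each gene by its mean absolute attribution across cells and sort in descending order. The helper `rank_features`, the scoring rule, and the toy gene names are assumptions for illustration; the scoring used by `get_ranked_features` itself is not specified in this page and may differ.

```python
from statistics import fmean


def rank_features(attrs, gene_names):
    """Rank genes by mean absolute attribution across cells.

    Hypothetical helper, not part of scvi-tools. `attrs` is a list of
    rows (one per cell), each holding one attribution value per gene.
    """
    n_genes = len(gene_names)
    if any(len(row) != n_genes for row in attrs):
        raise ValueError("each row of attrs must have one value per gene")
    # Column-wise mean of absolute attributions: one score per gene.
    scores = [fmean(abs(row[j]) for row in attrs) for j in range(n_genes)]
    # Order gene positions from highest to lowest score.
    order = sorted(range(n_genes), key=lambda j: scores[j], reverse=True)
    return [(gene_names[j], scores[j]) for j in order]


# Toy attributions: 3 cells x 3 genes.
attrs = [
    [0.1, -0.9, 0.0],
    [0.3, 0.7, 0.0],
    [-0.2, 0.8, 0.1],
]
ranked = rank_features(attrs, ["geneA", "geneB", "geneC"])
for gene, score in ranked:
    print(f"{gene}\t{score:.3f}")
```

Taking the absolute value before averaging keeps strongly negative attributions from cancelling strongly positive ones; a signed mean would be the alternative if the direction of the effect matters.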

SemisupervisedTrainingMixin.predict(adata=None, indices=None, soft=False, batch_size=None, use_posterior_mean=True, ig_interpretability=False, ig_args=None, dataloader=None)

Return cell label predictions.

Parameters:

adata (default: None) – AnnData or MuData object that has been registered via the corresponding setup method in the model class.
indices (default: None) – Indices of cells in adata to use. If None, all cells are used.
soft (default: False) – If True, returns per-class probabilities instead of a single predicted label per cell.
batch_size (default: None) – Minibatch size for data loading into a model. Defaults to scvi.settings.batch_size.
use_posterior_mean (default: True) – If True, uses the mean of the posterior distribution to predict celltype labels. Otherwise, uses a sample from the posterior distribution - this means that the predictions will be stochastic.
ig_interpretability (default: False) – If True, runs integrated gradients interpretability per sample and returns a score matrix in which, for each sample, each gene is scored for its contribution to that sample's prediction.
ig_args (default: None) – Keyword args for IntegratedGradients
dataloader (default: None) – An iterator over minibatches of data on which to compute the metric. The minibatches should be formatted as a dictionary of Tensor with keys as expected by the model. If None, a dataloader is created from adata.
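
The relationship between the two output modes of `soft` can be sketched as follows: per-class probabilities are reduced to one label per cell by picking the most probable class. This is an illustrative, dependency-free sketch; the helper `hard_labels`, the class names, and the probabilities are made up, and the exact reduction performed inside `predict` is an assumption here.

```python
def hard_labels(soft_predictions, class_names):
    """Pick the most probable class for each cell.

    Hypothetical helper, not part of scvi-tools. `soft_predictions`
    is a list of rows (one per cell) of per-class probabilities.
    """
    labels = []
    for probs in soft_predictions:
        if len(probs) != len(class_names):
            raise ValueError("one probability per class is required")
        # Index of the largest probability (argmax over classes).
        best = max(range(len(probs)), key=probs.__getitem__)
        labels.append(class_names[best])
    return labels


# Toy soft predictions for two cells over three classes.
soft = [
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
]
labels = hard_labels(soft, ["B cell", "T cell", "NK cell"])
print(labels)
```

Keeping the soft output is useful when downstream steps need a confidence value, for example to flag cells whose top probability is low.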

SemisupervisedTrainingMixin.shap_adata_predict(X): SHAP operator that returns soft predictions for the given data X.

SemisupervisedTrainingMixin.shap_predict(adata=None, indices=None, shap_args=None): Run the SHAP interpreter for a trained model and return the SHAP values.

SemisupervisedTrainingMixin.train(max_epochs=None, n_samples_per_label=None, check_val_every_n_epoch=None, train_size=0.9, validation_size=None, shuffle_set_split=True, batch_size=128, accelerator='auto', devices='auto', adversarial_classifier=None, datasplitter_kwargs=None, plan_config=None, plan_kwargs=None, datamodule=None, trainer_config=None, **trainer_kwargs)

Train the model.

Parameters:

max_epochs (int | None (default: None)) – Number of passes through the dataset for semisupervised training.
n_samples_per_label (float | None (default: None)) – Number of subsamples for each label class to sample per epoch. By default, there is no label subsampling.
check_val_every_n_epoch (int | None (default: None)) – Frequency with which metrics are computed on the data for the validation set for both the unsupervised and semisupervised trainers. If you’d like a different frequency for the semisupervised trainer, set check_val_every_n_epoch in semisupervised_train_kwargs.
train_size (float (default: 0.9)) – Size of the training set in the range [0.0, 1.0].
validation_size (float | None (default: None)) – Size of the validation set. If None, defaults to 1 - train_size. If train_size + validation_size < 1, the remaining cells belong to a test set.
shuffle_set_split (bool (default: True)) – Whether to shuffle indices before splitting. If False, the validation, training, and test sets are split in the sequential order of the data according to the validation_size and train_size fractions.
batch_size (int (default: 128)) – Minibatch size to use during training.
accelerator (str (default: 'auto')) – Supports passing different accelerator types (“cpu”, “gpu”, “tpu”, “ipu”, “hpu”, “mps”, “auto”) as well as custom accelerator instances.
devices (int | list[int] | str (default: 'auto')) – The devices to use. Can be set to a non-negative index (int or str), a sequence of device indices (list or comma-separated str), the value -1 to indicate all available devices, or “auto” for automatic selection based on the chosen accelerator. If set to “auto” and accelerator is not determined to be “cpu”, then devices will be set to the first available device.
adversarial_classifier (bool | None (default: None)) – Whether to use an adversarial classifier in the latent space. This helps mixing when there are missing proteins in any of the batches. Defaults to True if missing proteins are detected.
datasplitter_kwargs (dict | None (default: None)) – Additional keyword arguments passed into SemiSupervisedDataSplitter.
plan_kwargs (Mapping[str, Any] | KwargsConfig | None (default: None)) – Keyword args for SemiSupervisedTrainingPlan. Keyword arguments passed to train() will overwrite values present in plan_kwargs, when appropriate.
plan_config (Mapping[str, Any] | KwargsConfig | None (default: None)) – Configuration object or mapping used to build SemiSupervisedTrainingPlan. Values in plan_kwargs and explicit arguments take precedence.
datamodule (LightningDataModule | None (default: None)) – EXPERIMENTAL A LightningDataModule instance to use for training in place of the default DataSplitter. Can only be passed in if the model was not initialized with AnnData.
trainer_config (Mapping[str, Any] | KwargsConfig | None (default: None)) – Configuration object or mapping used to build Trainer. Values in trainer_kwargs and explicit arguments take precedence.
**trainer_kwargs – Other keyword args for Trainer.
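
The interaction of train_size, validation_size, and shuffle_set_split described above can be sketched without any dependencies. The helper `split_indices`, its `seed` argument, and the rounding choices (round the training count up, the validation count down) are assumptions made for illustration; the splitter used by `train()` may round or order differently.

```python
import math
import random


def split_indices(n_cells, train_size=0.9, validation_size=None,
                  shuffle=True, seed=0):
    """Partition cell indices into train, validation, and test lists.

    Hypothetical helper, not part of scvi-tools.
    """
    if not 0.0 < train_size <= 1.0:
        raise ValueError("train_size must be in (0, 1]")
    n_train = math.ceil(train_size * n_cells)  # assumption: round up
    if validation_size is None:
        # Everything that is not training data becomes validation data.
        n_val = n_cells - n_train
    else:
        if train_size + validation_size > 1.0 + 1e-9:
            raise ValueError("train_size + validation_size must be <= 1")
        n_val = math.floor(validation_size * n_cells)  # assumption: round down

    indices = list(range(n_cells))
    if shuffle:
        random.Random(seed).shuffle(indices)

    # Sequential order: validation first, then training, then test.
    val = indices[:n_val]
    train = indices[n_val:n_val + n_train]
    test = indices[n_val + n_train:]  # leftover cells, possibly empty
    return train, val, test


# 8 cells, 75% train, 12.5% validation, no shuffling:
train, val, test = split_indices(8, train_size=0.75,
                                 validation_size=0.125, shuffle=False)
print(train, val, test)  # [1, 2, 3, 4, 5, 6] [0] [7]
```

With validation_size left as None the test list is always empty, which matches the rule that a test set only appears when train_size + validation_size < 1.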
