UnsupervisedTrainer

class scvi.inference.UnsupervisedTrainer(model, gene_dataset, train_size=0.9, test_size=None, n_iter_kl_warmup=None, n_epochs_kl_warmup=400, normalize_loss=None, **kwargs)

Bases: scvi.inference.trainer.Trainer

Class for unsupervised training of an autoencoder.
Parameters

- model – A model instance from class VAE, VAEC, SCANVI, or AutoZIVAE
- gene_dataset (GeneExpressionDataset) – A gene_dataset instance like CortexDataset()
- train_size (Union[int, float]) – The train size, a float between 0 and 1 representing the proportion of the dataset to use for training. Default: 0.9.
- test_size (Union[int, float, None]) – The test size, a float between 0 and 1 representing the proportion of the dataset to use for testing. Default: None, which is equivalent to the data not in the train set. If train_size and test_size do not add up to 1, the remaining samples are added to a validation_set.
- **kwargs – Other keyword arguments from the general Trainer class.
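The interaction between train_size and test_size can be illustrated with a small sketch (an illustration of the proportions described above, not the library's actual splitting code; the helper name split_sizes is hypothetical):

```python
def split_sizes(n_samples, train_size=0.9, test_size=None):
    """Derive (train, test, validation) counts from the split proportions."""
    n_train = int(train_size * n_samples)
    if test_size is None:
        # Default: the test set is everything not in the train set.
        n_test = n_samples - n_train
    else:
        n_test = int(test_size * n_samples)
    # If train_size and test_size do not add up to 1, the leftover
    # samples form a validation set.
    n_validation = n_samples - n_train - n_test
    return n_train, n_test, n_validation
```

For example, with 1000 cells the defaults give a 900/100/0 split, while train_size=0.5 and test_size=0.3 leave 200 cells for validation.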
Other Parameters

- n_epochs_kl_warmup – Number of epochs for linear warmup of the KL(q(z|x)||p(z)) term. After n_epochs_kl_warmup, the training objective is the ELBO. This might be used to prevent inactivity of latent units, and/or to improve clustering of the latent space, as a long warmup turns the model into something closer to an autoencoder. Be aware that large datasets should avoid this mode and rely on n_iter_kl_warmup instead. If this parameter is not None, it overrides any choice of n_iter_kl_warmup.
- n_iter_kl_warmup – Number of iterations for warmup (useful for bigger datasets); int(128*5000/400) is a good default value.
- normalize_loss – A boolean determining whether the loss is divided by the total number of samples used for training. In particular, when the global KL divergence is equal to 0 and the division is performed, the loss for a minibatch is equal to the average of the reconstruction losses and KL divergences on the minibatch. Default: None, which is equivalent to False when the model is an instance of AutoZIVAE and True otherwise.
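The warmup behaviour described above can be sketched as a linear ramp of the KL weight from 0 to 1, with the epoch-based setting taking precedence (a hypothetical kl_weight helper illustrating the schedule, not the trainer's exact implementation):

```python
def kl_weight(epoch, iteration, n_epochs_kl_warmup=400, n_iter_kl_warmup=None):
    """Linear warmup weight applied to the KL(q(z|x)||p(z)) term."""
    if n_epochs_kl_warmup is not None:
        # Epoch-based warmup overrides any choice of n_iter_kl_warmup.
        return min(1.0, epoch / n_epochs_kl_warmup)
    if n_iter_kl_warmup is not None:
        # Iteration-based warmup, useful for bigger datasets.
        return min(1.0, iteration / n_iter_kl_warmup)
    return 1.0  # no warmup: the objective is the full ELBO from the start
```

Once the weight reaches 1, the training objective is the full ELBO; this is why the total number of epochs (or iterations) should exceed the warmup length.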
Examples

>>> gene_dataset = CortexDataset()
>>> vae = VAE(gene_dataset.nb_genes, n_batch=gene_dataset.n_batches * False,
...           n_labels=gene_dataset.n_labels)
>>> trainer = UnsupervisedTrainer(vae, gene_dataset, train_size=0.5)
>>> trainer.train(n_epochs=20, lr=1e-3)
Notes

Two parameters can help control the training KL annealing. If your application relies on the posterior quality (i.e. differential expression, batch effect removal), ensure that the total number of epochs (or iterations) exceeds the number of epochs (or iterations) used for KL warmup.
Attributes Summary

- default_metrics_to_monitor
- kl_weight
- posteriors_loop

Methods Summary

- loss(tensors[, feed_labels])

Attributes Documentation
- default_metrics_to_monitor = ['elbo']
- kl_weight
- posteriors_loop
Methods Documentation