UnsupervisedTrainer

class scvi.inference.UnsupervisedTrainer(model, gene_dataset, train_size=0.9, test_size=None, n_iter_kl_warmup=None, n_epochs_kl_warmup=400, normalize_loss=None, **kwargs)[source]

Bases: scvi.inference.trainer.Trainer

Class for unsupervised training of an autoencoder.

Parameters
  • model – A model instance from class VAE, VAEC, SCANVI, or AutoZIVAE

  • gene_dataset (GeneExpressionDataset) – A gene_dataset instance like CortexDataset()

  • train_size (Union[int, float]) – The train size, a float between 0 and 1 representing the proportion of the dataset to use for training. Default: 0.9.

  • test_size (Union[int, float, None]) – The test size, a float between 0 and 1 representing the proportion of the dataset to use for testing. Default: None, which is equivalent to all data not in the train set. If train_size and test_size do not add up to 1, the remaining samples are placed in a validation set (see the split sketch after this parameter list).

  • **kwargs – Other keyword arguments from the general Trainer class.
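
The split logic above can be sketched as follows. This is an illustrative helper only, not part of the scvi API: it assumes each fraction is rounded down to an integer number of samples, and the helper name compute_split_sizes is hypothetical.

>>> # Hypothetical helper illustrating the train/test/validation split.
>>> def compute_split_sizes(n_cells, train_size=0.9, test_size=None):
...     n_train = int(train_size * n_cells)
...     if test_size is None:
...         n_test = n_cells - n_train  # everything not in the train set
...     else:
...         n_test = int(test_size * n_cells)
...     n_validation = n_cells - n_train - n_test  # leftover samples
...     return n_train, n_test, n_validation
>>> compute_split_sizes(1000, train_size=0.5, test_size=0.3)
(500, 300, 200)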

Other Parameters
  • n_epochs_kl_warmup – Number of epochs for linear warmup of the KL(q(z|x)||p(z)) term. After n_epochs_kl_warmup epochs, the training objective is the full ELBO. Warmup can help prevent inactivity of latent units and improve clustering of the latent space, as a long warmup makes the model behave more like a plain autoencoder. Large datasets should avoid this mode and rely on n_iter_kl_warmup instead. If this parameter is not None, it overrides any choice of n_iter_kl_warmup. A sketch of the resulting annealing schedule follows this list.

  • n_iter_kl_warmup – Number of iterations (minibatch updates) for KL warmup, useful for bigger datasets; int(128*5000/400) is a good default value.

  • normalize_loss – A boolean determining whether the loss is divided by the total number of samples used for training. In particular, when the global KL divergence is equal to 0 and the division is performed, the loss for a minibatch is equal to the average of reconstruction losses and KL divergences on the minibatch. Default: None, which is equivalent to False when the model is an instance of AutoZIVAE and True otherwise.
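
The annealing weight implied by these parameters can be sketched as below. This assumes a linear ramp from 0 to 1 over the warmup period, consistent with the kl_weight attribute documented further down; the exact scvi implementation may differ in detail.

>>> # Illustrative linear KL annealing weight (a sketch, not the exact
>>> # scvi implementation): ramps from 0 to 1 during warmup, after which
>>> # the objective is the full ELBO.
>>> def kl_weight(epoch, n_epochs_kl_warmup=400):
...     if n_epochs_kl_warmup is None:
...         return 1.0  # no warmup: full KL term from the start
...     return min(1.0, epoch / n_epochs_kl_warmup)
>>> kl_weight(100)  # a quarter of the way through warmup
0.25
>>> kl_weight(400)  # warmup finished
1.0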

Examples

>>> from scvi.dataset import CortexDataset
>>> from scvi.models import VAE
>>> from scvi.inference import UnsupervisedTrainer
>>> gene_dataset = CortexDataset()
>>> # n_batches * False evaluates to 0, i.e. no batch correction
>>> vae = VAE(gene_dataset.nb_genes, n_batch=gene_dataset.n_batches * False,
...           n_labels=gene_dataset.n_labels)
>>> trainer = UnsupervisedTrainer(vae, gene_dataset, train_size=0.5)
>>> trainer.train(n_epochs=20, lr=1e-3)

Notes

Two parameters help control KL annealing during training: n_epochs_kl_warmup and n_iter_kl_warmup. If your application relies on posterior quality (e.g. differential expression, batch effect removal), ensure that the total number of epochs (or iterations) exceeds the number of epochs (or iterations) used for KL warmup, so that the model is trained on the full ELBO for part of the run.
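
For instance, an illustrative configuration (not a recommendation) that leaves half of training on the full ELBO:

>>> # Warmup ends at epoch 400; the remaining 400 epochs optimize the
>>> # full ELBO.
>>> trainer = UnsupervisedTrainer(vae, gene_dataset, n_epochs_kl_warmup=400)
>>> trainer.train(n_epochs=800, lr=1e-3)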

Attributes Summary

default_metrics_to_monitor

kl_weight

posteriors_loop

Methods Summary

loss(tensors[, feed_labels])

on_training_begin()

on_training_end()

Attributes Documentation

default_metrics_to_monitor = ['elbo']
kl_weight
posteriors_loop

Methods Documentation

loss(tensors, feed_labels=True)[source]
on_training_begin()[source]
on_training_end()[source]