scvi.train.TrainingPlan.training_step

TrainingPlan.training_step(batch, batch_idx, optimizer_idx=0)[source]

Here you compute and return the training loss and some additional metrics, e.g. for the progress bar or logger.

Parameters
batch : Tensor | (Tensor, …) | [Tensor, …]

The output of your DataLoader. A tensor, tuple or list.

batch_idx : int

The integer index of this batch.

optimizer_idx : int

When using multiple optimizers, this argument will also be present.

hiddens : Tensor

Passed in if truncated_bptt_steps > 0.

Returns

Any of:

  • Tensor - The loss tensor

  • dict - A dictionary. Can include any keys, but must include the key 'loss'

  • None - Training will skip to the next batch

Note

Returning None is currently not supported for multi-GPU or TPU, or with 16-bit precision enabled.
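For instance, a minimal sketch of the None case, assuming torch is imported and reusing the encoder/loss pattern from the example further below; the non-finite check is only illustrative:

def training_step(self, batch, batch_idx):
    x, y, z = batch
    loss = self.loss(self.encoder(x), x)
    # returning None tells Lightning to skip this batch
    if not torch.isfinite(loss):
        return None
    return loss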

In this step you’d normally do the forward pass and calculate the loss for a batch. You can also do fancier things like multiple forward passes or something model-specific.

Example:

def training_step(self, batch, batch_idx):
    x, y, z = batch
    out = self.encoder(x)
    loss = self.loss(out, x)
    return loss
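Returning a dictionary works as well, e.g. when you also want to log metrics from the same step; a minimal sketch, assuming the same batch structure as above (the 'train_loss' metric name is only illustrative):

def training_step(self, batch, batch_idx):
    x, y, z = batch
    out = self.encoder(x)
    loss = self.loss(out, x)
    # log to the progress bar and the logger
    self.log('train_loss', loss, prog_bar=True)
    # the returned dict may contain any keys, but must include 'loss'
    return {'loss': loss}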

If you define multiple optimizers, this step will be called with an additional optimizer_idx parameter.

# Multiple optimizers (e.g. GANs)
def training_step(self, batch, batch_idx, optimizer_idx):
    if optimizer_idx == 0:
        # do training_step with encoder
        ...
    if optimizer_idx == 1:
        # do training_step with decoder
        ...
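The optimizer_idx values follow the order in which the optimizers are returned from configure_optimizers().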

If you add truncated back propagation through time you will also get an additional argument with the hidden states of the previous step.

# Truncated back-propagation through time
def training_step(self, batch, batch_idx, hiddens):
    # hiddens are the hidden states from the previous truncated backprop step
    ...
    out, hiddens = self.lstm(data, hiddens)
    ...
    # return the new hidden states under the 'hiddens' key so the next
    # truncated backprop step receives them
    return {'loss': loss, 'hiddens': hiddens}
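With truncated_bptt_steps set, Lightning splits each batch along the time dimension into chunks of that length and calls training_step once per chunk, passing along the hiddens returned by the previous call.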

Note

The loss value shown in the progress bar is smoothed (averaged) over the last values, so it differs from the actual loss returned in the training/validation step.