New in 0.8.0 (2020-12-17)#
It is now possible to iteratively update these models with new samples, without altering the model for the “reference” population. Here we use the scArches method. For usage, please see the tutorial in the user guide.
To enable scArches in our models, we added a few new options. The first is encode_covariates, which is an SCVI option to encode the one-hotted batch covariate. We also allow users to exchange batch norm in the encoder and decoder with layer norm, which can be thought of as batch norm but per cell. As the layer norm we use has no parameters, it’s a bit faster than models with batch norm. We don’t find many differences between using batch norm or layer norm in our models, though we have kept defaults the same in this case. To run scArches effectively, batch norm should be exchanged with layer norm.
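To make the "batch norm but per cell" description concrete, here is a minimal pure-Python sketch. It is not the scvi-tools implementation: the helper names `layer_norm`, `batch_norm`, and `_standardize` are hypothetical, and the batch norm shown uses only the statistics of the current minibatch, as in training mode. The layer norm has no learned scale or shift, matching the parameter-free variant described above.

```python
from statistics import fmean, pvariance


def _standardize(values, eps=1e-5):
    """Shift to zero mean and scale to (roughly) unit variance."""
    mu = fmean(values)
    var = pvariance(values, mu)
    return [(v - mu) / (var + eps) ** 0.5 for v in values]


def layer_norm(matrix):
    """Normalize each row (cell) across its features; no learned parameters."""
    return [_standardize(row) for row in matrix]


def batch_norm(matrix):
    """Normalize each column (feature) across the cells of the minibatch."""
    columns = [_standardize(list(col)) for col in zip(*matrix)]
    return [list(row) for row in zip(*columns)]


# Toy minibatch (made-up numbers): 3 cells x 4 hidden features.
hidden = [
    [1.0, 2.0, 3.0, 4.0],
    [10.0, 20.0, 30.0, 40.0],
    [0.0, 0.0, 6.0, 6.0],
]
ln = layer_norm(hidden)
bn = batch_norm(hidden)
```

Because `layer_norm` looks at one cell at a time, its output for a reference cell does not change when cells from a new sample are added to the minibatch. This independence from batch statistics is one plausible reason it suits iterative updates, though the source does not spell out the rationale.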
Empirical initialization of protein background parameters with totalVI#
The learned prior parameters for the protein background were previously initialized randomly. Now, they can be set with an option in TOTALVI. This option fits a two-component Gaussian mixture model per cell, separating those proteins that are background
for the cell and those that are foreground, and aggregates the learned mean and variance of the smaller component across cells. This computation is done per batch, if the batch_key was registered. We emphasize this is just for the initialization of a learned parameter in the model.
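The initialization described above can be sketched in plain Python. This is an illustration under stated assumptions, not the totalVI code: the function names `fit_two_component_gmm` and `empirical_background_init` are hypothetical, the inputs are assumed to be log-transformed protein counts, "smaller component" is read as the component with the lower mean, and aggregation across cells is assumed to be a simple average. With a registered batch_key, the same routine would be run once per batch.

```python
import math
from statistics import fmean


def fit_two_component_gmm(values, n_iter=50, min_var=1e-6):
    """Fit a 1-D two-component Gaussian mixture with plain EM.

    Returns (means, variances, weights), each a list of length 2.
    """
    overall_mean = fmean(values)
    overall_var = max(fmean([(v - overall_mean) ** 2 for v in values]), min_var)
    # Start one component at each extreme of the data.
    means = [min(values), max(values)]
    variances = [overall_var, overall_var]
    weights = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: responsibility of each component for each value.
        resp = []
        for v in values:
            dens = [
                weights[k]
                * math.exp(-0.5 * (v - means[k]) ** 2 / variances[k])
                / math.sqrt(2.0 * math.pi * variances[k])
                for k in range(2)
            ]
            total = sum(dens) or 1e-300  # guard against underflow to zero
            resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean, and variance per component.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue  # component is empty; keep its previous parameters
            weights[k] = nk / len(values)
            means[k] = sum(r[k] * v for r, v in zip(resp, values)) / nk
            spread = sum(r[k] * (v - means[k]) ** 2 for r, v in zip(resp, values)) / nk
            variances[k] = max(spread, min_var)
    return means, variances, weights


def empirical_background_init(cells):
    """Average the lower-mean component's mean and variance across cells."""
    bg_means, bg_vars = [], []
    for cell in cells:
        means, variances, _ = fit_two_component_gmm(cell)
        k = 0 if means[0] <= means[1] else 1  # assumption: background = lower mean
        bg_means.append(means[k])
        bg_vars.append(variances[k])
    return fmean(bg_means), fmean(bg_vars)


# Toy data (made up): two cells, each with a low and a high cluster of proteins.
cells = [
    [0.1, 0.2, 0.15, 0.05, 3.0, 3.2, 2.9],
    [0.0, 0.3, 0.2, 0.1, 4.0, 4.1],
]
bg_mean, bg_var = empirical_background_init(cells)
```

The resulting pair would only seed the learned background parameters; training is still free to move them away from these values.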
Use observed library size option#
Many of our models, like TOTALVI, learn a latent library size variable. The option use_observed_lib_size may now be passed on model initialization. We have set this to True by default, as we see no regression in performance, and training is a bit faster.
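As a rough illustration of what replaces the latent variable, the sketch below computes an observed library size as the log of each cell's total counts. The function name `observed_log_library_size` is hypothetical, and the log scale is an assumption made for this example rather than something stated in these notes.

```python
import math


def observed_log_library_size(counts_per_cell):
    """Return log(total counts) per cell; each row holds one cell's raw counts."""
    return [math.log(sum(cell)) for cell in counts_per_cell]


# Toy count matrix (made up): 3 cells x 3 genes.
counts = [[3, 0, 7], [1, 1, 1], [100, 50, 50]]
log_lib = observed_log_library_size(counts)
```

Since this value is computed directly from the data, no encoder pass is needed to infer it, which is consistent with the slightly faster training noted above. A cell with zero total counts would raise an error here and would need to be filtered out first.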
To facilitate these enhancements, saved TOTALVI models from previous versions will not load properly. This is due to an architecture change of the totalVI encoder, related to latent library size handling.
The default latent distribution for
Autotune was removed from this release. We could not maintain the code given the new API changes and we will soon have alternative ways to tune hyperparameters.
Protein names during setup_anndata are now stored in adata.uns["_scvi"]["protein_names"], instead of
Fixed an issue where the unlabeled category affected the SCANVI architecture prior distribution. Unfortunately, by fixing this bug, loading previously trained (<v0.8.0) SCANVI models will fail.