- scvi.data.pbmc_dataset(save_path='data/', run_setup_anndata=True, remove_extracted_data=True)
Loads the PBMC dataset.
We considered scRNA-seq data from two batches of peripheral blood mononuclear cells (PBMCs) from a healthy donor (4K PBMCs and 8K PBMCs). We derived quality control metrics using the cellrangerRkit R package (v. 1.1.0). Quality metrics were extracted from CellRanger via the molecule-specific information file. After filtering, we extracted 12,039 cells with 10,310 sampled genes and obtained biologically meaningful clusters with the software Seurat. We then filtered out genes that we could not match with the bulk data used for differential expression, leaving g = 3,346 genes.
- Return type
AnnData with batch info (.obs['batch']) and label info.
>>> import scvi
>>> adata = scvi.data.pbmc_dataset()
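Since the dataset combines two batches, a quick check after loading is to count cells per batch. The following is a minimal sketch, not part of scvi-tools: the helper names `batch_counts` and `load_pbmc_batch_counts` are hypothetical, and it assumes scvi-tools is installed, the data can be downloaded to `save_path`, and the returned AnnData exposes `.obs['batch']` as described under Return type.

```python
from collections import Counter


def batch_counts(batch_labels):
    """Count cells per batch from any iterable of batch labels.

    Labels are converted to strings so that integer and categorical
    encodings are reported the same way.
    """
    return dict(Counter(str(label) for label in batch_labels))


def load_pbmc_batch_counts(save_path="data/"):
    """Load the PBMC dataset and return the number of cells per batch.

    Hypothetical helper: scvi is imported here so the module can be
    imported without scvi-tools. Calling it may download the data.
    """
    import scvi

    adata = scvi.data.pbmc_dataset(save_path=save_path)
    # .obs['batch'] is the batch column named in the Return type above.
    return batch_counts(adata.obs["batch"])
```

Calling `load_pbmc_batch_counts()` should return a dictionary with one entry per batch. The exact counts depend on the filtering applied by the loader, so they are not asserted here.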