scvi.data.pbmc_dataset

scvi.data.pbmc_dataset(save_path='data/', run_setup_anndata=True, remove_extracted_data=True)

Loads the PBMC dataset.

We considered scRNA-seq data from two batches of peripheral blood mononuclear cells (PBMCs) from a healthy donor (4K PBMCs and 8K PBMCs). We derived quality control metrics using the cellrangerRkit R package (v. 1.1.0). Quality metrics were extracted from CellRanger through the molecule-specific information file. After filtering, we extracted 12,039 cells with 10,310 sampled genes and obtained biologically meaningful clusters with the software Seurat. We then filtered out genes that we could not match with the bulk data used for differential expression, leaving g = 3346 genes.

Parameters
save_path : str (default: 'data/')

Location to use when saving/loading the data.

run_setup_anndata : bool (default: True)

If True, runs setup_anndata() on the dataset before returning.

remove_extracted_data : bool (default: True)

If True, removes the folder the data was extracted to.

Return type

AnnData

Returns

AnnData with batch info (.obs['batch']) and label info (.obs['labels']).

Examples

>>> import scvi
>>> adata = scvi.data.pbmc_dataset()
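
As a rough sketch of how the returned object might be inspected, the snippet below counts cells per batch and per label using the two `.obs` columns documented above. The helper names `summarize_obs` and `load_and_summarize` and the `toy` stand-in object are hypothetical and not part of scvi; the stand-in holds made-up values so the helper can be tried without downloading anything. Only `scvi.data.pbmc_dataset` and its `save_path` parameter come from this page.

```python
from collections import Counter
from types import SimpleNamespace


def summarize_obs(adata, keys=("batch", "labels")):
    """Count cells per category for each requested .obs column.

    Hypothetical helper: works on any object whose .obs supports
    key lookup and yields one value per cell when iterated.
    """
    return {key: dict(Counter(adata.obs[key])) for key in keys}


def load_and_summarize(save_path="data/"):
    """Hypothetical wrapper: load the PBMC dataset, then summarize it."""
    # Imported lazily so summarize_obs has no hard dependency on scvi.
    # Assumes scvi is installed and the data can be fetched or is cached.
    import scvi

    adata = scvi.data.pbmc_dataset(save_path=save_path)
    return summarize_obs(adata)


# Stand-in object with made-up values, only to show the shape of the result.
toy = SimpleNamespace(obs={"batch": [0, 0, 1], "labels": [2, 5, 2]})
toy_summary = summarize_obs(toy)
print(toy_summary)
```

Calling `load_and_summarize()` on the real dataset should return a dictionary of the same shape, with one entry per batch and per label code actually present in the data.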