Datasets

Import scvi-tools as:

import scvi

Built-in data

Here we host some published datasets that are useful for benchmarking and testing models.

data.cellxgene

Loads a file from the cellxgene portal.

data.pbmc_seurat_v4_cite_seq

Dataset of PBMCs measured with CITE-seq (161764 cells).

data.spleen_lymph_cite_seq

Immune cells from the murine spleen and lymph nodes [Gayoso et al., 2021].

data.heart_cell_atlas_subsampled

Combined single-cell and single-nucleus RNA-seq data of 485K cardiac cells with annotations.

data.pbmcs_10x_cite_seq

Filtered PBMCs from 10x Genomics profiled with RNA and protein.

data.purified_pbmc_dataset

Purified PBMC dataset from: "Massively parallel digital transcriptional profiling of single cells".

data.dataset_10x

Loads a file from the 10x Genomics website.

data.brainlarge_dataset

Loads the brain-large dataset.

data.pbmc_dataset

Loads the PBMC dataset.

data.cortex

Loads the cortex dataset.

data.smfish

Loads osmFISH data of mouse cortex cells from the Linnarsson lab.

data.synthetic_iid

Synthetic multimodal dataset.

data.breast_cancer_dataset

Loads the breast cancer dataset.

data.mouse_ob_dataset

Loads the mouse olfactory bulb (OB) dataset.

data.retina

Loads the retina dataset.

data.prefrontalcortex_starmap

Loads a STARmap dataset of mouse prefrontal cortex (Wang et al., 2018).

data.frontalcortex_dropseq

Loads cells from the mouse frontal cortex sequenced with Drop-seq (Saunders et al., 2018).