Poster
Overview
In the fifteen years since its inception, single-cell RNA sequencing (scRNA-seq) has spread rapidly across multiple fields of research, leading to many new discoveries. As technologies have matured, the number of cells that can be processed has grown exponentially, with workflows now assaying up to one million cells in a single experiment.
While high-throughput sequencing methods have facilitated the discovery and characterization of various cell types, sequencing costs can be prohibitively high for routine use. Many applications of scRNA-seq focus on cell type identification, gene regulatory networks, or biomarker discovery. These applications often do not require surveying the entire transcriptome, but rather interrogating specific sets of well-characterized genes. In these cases, sequencing the entire transcriptome may add unnecessary cost to a project. Increasing throughput while minimizing sequencing costs therefore requires a targeted gene enrichment method.
Here we extend our whole transcriptome split-pool combinatorial barcoding technology to enable enrichment of a subset of genes in the final single-cell sequencing libraries. To illustrate the power of our technology, we enriched a whole transcriptome library of one million peripheral blood mononuclear cells (PBMCs) from twelve donors. We used our immune gene panel to enrich 1,000 genes representing canonical immune cell markers and pathways. Our method increased the percentage of on-target reads from 8% in the whole transcriptome libraries to 60% in the enriched libraries. Furthermore, despite a greater than ten-fold reduction in sequencing reads between the unenriched and enriched libraries, the resulting clustering yielded very high concordance of cell type identities.
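The on-target figures above imply a simple piece of arithmetic, sketched below. Only the 8% and 60% fractions come from the text. The whole transcriptome read depth and the exact ten-fold reduction are assumptions chosen for illustration (the text states only "greater than ten-fold"), and the function names are hypothetical.

```python
def fold_enrichment(fraction_before: float, fraction_after: float) -> float:
    """Ratio of on-target read fractions after vs. before enrichment."""
    if not 0 < fraction_before <= 1:
        raise ValueError("fraction_before must lie in (0, 1]")
    if not 0 <= fraction_after <= 1:
        raise ValueError("fraction_after must lie in [0, 1]")
    return fraction_after / fraction_before


def on_target_reads(total_reads: float, on_target_fraction: float) -> float:
    """Number of reads expected to fall on the targeted genes."""
    return total_reads * on_target_fraction


# Fractions reported in the text.
WT_FRACTION = 0.08        # whole transcriptome library
ENRICHED_FRACTION = 0.60  # enriched library

# Assumptions for illustration only, not reported values.
WT_TOTAL_READS = 1_000_000_000  # hypothetical sequencing depth
READ_REDUCTION = 10             # text says "greater than ten-fold"

enrichment = fold_enrichment(WT_FRACTION, ENRICHED_FRACTION)
wt_on_target = on_target_reads(WT_TOTAL_READS, WT_FRACTION)
enriched_on_target = on_target_reads(
    WT_TOTAL_READS / READ_REDUCTION, ENRICHED_FRACTION
)
retained = enriched_on_target / wt_on_target

print(f"fold enrichment of on-target fraction: {enrichment:.1f}x")
print(f"on-target reads retained at {READ_REDUCTION}x fewer reads: {retained:.0%}")
```

Under these assumptions the on-target fraction rises about 7.5-fold, so a library sequenced with one tenth of the reads would still carry roughly three quarters of the on-target reads of the unenriched library. The retained fraction does not depend on the assumed depth, only on the two fractions and the reduction factor.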
Overall, we demonstrate that our modular enrichment strategy preserves the biological structure of the data. We envision that our approach will enable researchers to reduce sequencing costs while substantially scaling up the number of cells and samples across experiments.
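One simple way to quantify the concordance of cell type identities between unenriched and enriched libraries is the fraction of cells that receive the same label in both analyses. The sketch below is an illustration under that assumption, not the metric used in this work. The function name and the toy labels are hypothetical.

```python
from typing import Sequence


def label_concordance(labels_a: Sequence[str], labels_b: Sequence[str]) -> float:
    """Fraction of cells assigned the same cell type label in both analyses.

    Both sequences must list the same cells in the same order.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("label sequences must have the same length")
    if len(labels_a) == 0:
        raise ValueError("label sequences must not be empty")
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return matches / len(labels_a)


# Toy labels for five cells, invented for illustration only.
whole_transcriptome = ["T cell", "B cell", "Monocyte", "NK cell", "T cell"]
enriched = ["T cell", "B cell", "Monocyte", "T cell", "T cell"]

score = label_concordance(whole_transcriptome, enriched)
print(f"label concordance: {score:.0%}")
```

Per-cell agreement requires that cluster labels already be mapped to shared cell type names. A permutation-invariant measure such as the adjusted Rand index would be an alternative when only cluster assignments are available.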