Poster
Overview
Here, we describe the creation of the largest publicly available single-cell dataset to date. Tahoe-100M encompasses 50 cell lines, 379 drugs, and over 100M cells, setting a new standard for modeling drug perturbations in diseased cells. In this first-of-its-kind study, proprietary cell lines (Vevo Therapeutics) were mixed, co-cultured, and then treated for 24 hours with hundreds of cancer drugs at 3 concentrations each. At the end of the treatment period, cells were dissociated and fixed to allow single-cell barcoding in batches of more than 10 million cells each. Combinatorial barcoding was performed on Parse’s GigaLab platform, which scales the workflow to multi-million-cell inputs and higher throughput. Libraries were converted for processing on the UG 100™ sequencer (Ultima Genomics). Sequencing output from over 100M cells was processed using the Parse Biosciences pipeline, and cell line identity was deconvolved with SNP-based demultiplexing. This study, encompassing over 100M cells across more than 56,000 conditions, offers a valuable resource for AI-driven biological research and drug discovery. It is expected to inspire the scientific community to pursue large-scale, multi-million-cell single-cell sequencing experiments, which will ultimately help advance our understanding of disease-specific drug responses, with the end goal of improving existing therapies and creating new ones for cancer and other diseases.
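As a rough check on the scale quoted above, the condition count follows from multiplying the design dimensions. The short sketch below assumes a full factorial design (every cell line exposed to every drug at every concentration) and ignores any control conditions. The even spread of exactly 100M cells is also an assumption made only for illustration, and the variable names are hypothetical rather than taken from the study.

```python
# Design dimensions stated in the overview.
n_cell_lines = 50
n_drugs = 379
n_concentrations = 3

# Assumption: full factorial design, with no control conditions counted.
n_conditions = n_cell_lines * n_drugs * n_concentrations

# Assumption: exactly 100M cells spread evenly across conditions.
# The overview only states "over 100M", so this is a lower-bound estimate.
total_cells = 100_000_000
cells_per_condition = total_cells / n_conditions

print(f"conditions: {n_conditions}")                      # conditions: 56850
print(f"cells per condition: {cells_per_condition:.0f}")  # cells per condition: 1759
```

Under these assumptions the product is 56,850, consistent with the "more than 56,000 conditions" figure, and each condition would average on the order of 1,800 cells.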