scMerge leverages factor analysis, stable expression, and pseudoreplication to merge multiple single-cell RNA-seq datasets
- Author(s)
- Lin, Y; Ghazanfar, S; Wang, KYX; Gagnon-Bartsch, JA; Lo, KK; Su, X; Han, ZG; Ormerod, JT; Speed, TP; Yang, P; Yang, JYH
- Journal Title
- Proceedings of the National Academy of Sciences of the United States of America
- Publication Type
- Journal Article in press
- Abstract
- Concerted examination of multiple collections of single-cell RNA sequencing (RNA-seq) data promises further biological insights that cannot be uncovered with individual datasets. Here we present scMerge, an algorithm that integrates multiple single-cell RNA-seq datasets using factor analysis of stably expressed genes and pseudoreplicates across datasets. Using a large collection of public datasets, we benchmark scMerge against published methods and demonstrate that it consistently provides improved cell type separation by removing unwanted factors; scMerge can also enhance biological discovery through robust data integration, which we show through the inference of development trajectory in a liver dataset collection.
- Publisher
- National Academy of Sciences
- Research Division(s)
- Bioinformatics
- PubMed ID
- 31028141
- Publisher's Version
- https://doi.org/10.1073/pnas.1820006116
- NHMRC Grants
- NHMRC/1054618
- Terms of Use/Rights Notice
- Refer to copyright notice on published article.
Creation Date: 2019-04-30 11:47:39
Last Modified: 2019-04-30 12:15:23