Cellsig plug-in enhances CIBERSORTx signature selection for multi-dataset transcriptomes with sparse multilevel modelling
- Author(s)
- Al Kamran Khan, MA; Wu, J; Yuhan, S; Barrow, AD; Papenfuss, AT; Mangiola, S
- Details
- Publication Year 2023-11-11, Volume 39, Issue #12, Page btad685
- Journal Title
- Bioinformatics
- Abstract
- MOTIVATION: The precise characterisation of cell-type transcriptomes is pivotal to understanding cellular lineages, deconvolution of bulk transcriptomes, and clinical applications. Single-cell RNA sequencing resources like the Human Cell Atlas have revolutionised cell-type profiling. However, challenges persist due to data heterogeneity and discrepancies across different studies. One limitation of prevailing tools such as CIBERSORTx is their inability to address hierarchical data structures and handle non-overlapping gene sets across samples, relying on filtering or imputation. RESULTS: Here, we present cellsig, a Bayesian sparse multilevel model designed to improve signature estimation by adjusting data for multilevel effects and modelling for gene-set sparsity. Our model is tailored to large-scale, heterogeneous pseudobulk and bulk RNA sequencing data collections with non-overlapping gene sets. We tested the performances of cellsig on a novel curated Human Bulk Cell-type Catalogue, which harmonises 1,435 samples across 58 datasets. We show that cellsig significantly enhances cell-type marker gene ranking performance. This approach is valuable for cell-type signature selection, with implications for marker gene validation, single-cell annotation, and deconvolution benchmarks. AVAILABILITY: Codes and the interactive app are available at https://github.com/stemangiola/cellsig; and the database is available at https://doi.org/10.5281/zenodo.7582421. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
- Publisher
- Oxford Academic
- Keywords
- Humans; *Transcriptome; *Gene Expression Profiling; Bayes Theorem; Base Sequence; Sequence Analysis, RNA; Single-Cell Analysis
- Research Division(s)
- Bioinformatics
- PubMed ID
- 37952182
- Publisher's Version
- https://doi.org/10.1093/bioinformatics/btad685
- Open Access at Publisher's Site
- https://doi.org/10.1093/bioinformatics/btad685
- Terms of Use/Rights Notice
- Refer to copyright notice on published article.
Creation Date: 2023-11-20 03:32:47
Last Modified: 2023-12-13 10:19:53