Flexiplex: a versatile demultiplexer and search tool for omics data
- Author(s)
- Cheng, O; Ling, MH; Wang, C; Wu, S; Ritchie, ME; Göke, J; Amin, N; Davidson, NM
- Journal Title
- Bioinformatics
- Abstract
- MOTIVATION: The process of analyzing high-throughput sequencing data often requires the identification and extraction of specific target sequences. This could include tasks such as identifying cellular barcodes and UMIs in single-cell data, and specific genetic variants for genotyping. However, existing tools that perform these functions are often task-specific, such as only demultiplexing barcodes for a dedicated type of experiment, or are not tolerant to noise in the sequencing data. RESULTS: To overcome these limitations, we developed Flexiplex, a versatile and fast sequence searching and demultiplexing tool for omics data, which is based on the Levenshtein distance and thus allows imperfect matches. We demonstrate Flexiplex's application on three use cases: identifying cell line-specific sequences in Illumina short-read single-cell data, and discovering and demultiplexing cellular barcodes from noisy long-read single-cell RNA-seq data. We show that Flexiplex achieves an excellent balance of accuracy and computational efficiency compared to leading task-specific tools. AVAILABILITY: Flexiplex is available at https://davidsongroup.github.io/flexiplex/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
- Publisher
- Oxford Academic
- Keywords
- *Software; Sequence Analysis, DNA; *Search Engine; High-Throughput Nucleotide Sequencing; Electronic Data Processing
- Research Division(s)
- Blood Cells and Blood Cancer; Epigenetics and Development
- PubMed ID
- 38379414
- Publisher's Version
- https://doi.org/10.1093/bioinformatics/btae102
- Open Access at Publisher's Site
- https://doi.org/10.1093/bioinformatics/btae10
- Terms of Use/Rights Notice
- Refer to copyright notice on published article.
Creation Date: 2024-02-29 09:16:08
Last Modified: 2024-03-11 09:39:15