A comprehensive evaluation of long-read de novo transcriptome assembly
Journal Title
Genome Biology
Publication Type
Feb 18
Abstract
INTRODUCTION: Recently, de novo transcriptome assembly methods have been developed to utilise long-read data in cases where a reference genome is unavailable, such as in non-model organisms. Despite the potential of these tools, there remains a lack of benchmarking and established protocols for optimal reference-free, long-read transcriptome assembly and differential expression analysis. RESULTS: Here, we evaluate the long-read de novo transcriptome assembly tools RATTLE, RNA-Bloom2 and isONform, and compare their performance to one of the leading short-read assemblers, Trinity. We assess various metrics across a range of datasets, which include simulated data and spike-in sequin transcripts, where ground truth is known, and real data from human and pea (Pisum sativum) samples, using a reference-based approach to define truth. To represent contemporary analysis scenarios, the datasets cover depths from 6 to 60 million reads and span Oxford Nanopore Technologies (ONT) cDNA, ONT direct RNA and Pacific Biosciences (PacBio) 10x single-cell sequencing. Critically, we assess the downstream impact of assembly choice on the detection of differential gene and transcript expression. CONCLUSIONS: Our results confirm that long reads generate longer assembled transcripts than short reads for reference-free analysis, though limitations remain compared to reference-guided approaches, and suggest scope for improved accuracy and reduced redundancy. Of the de novo pipelines, RNA-Bloom2, coupled with Corset for transcript clustering, was the best performing in terms of both accuracy and computational efficiency. Our findings offer guidance on selecting the most effective strategy for long-read differential expression analysis when a high-quality reference genome is unavailable.
Publisher
Springer Nature
Keywords
Differential expression; Long reads; Non-model organisms; Reference-free; Transcriptome assembly
Research Division(s)
Bioinformatics and Computational Biology; Genetics and Gene Regulation; Blood Cells and Blood Cancer
PubMed ID
41709347
Open Access at Publisher's Site
https://doi.org/10.1186/s13059-026-04001-5
Terms of Use/Rights Notice
Refer to copyright notice on published article.


Creation Date: 2026-03-24 02:09:50
Last Modified: 2026-03-24 02:16:41