Differential Gene Expression in the Siphonophore Nanomia bijuga (Cnidaria) Assessed with Multiple Next-Generation Sequencing Workflows
- Author(s)
- Siebert, S; Robinson, MD; Tintori, SC; Goetz, F; Helm, RR; Smith, SA; Shaner, N; Haddock, SHD; Dunn, CW
- Details
- Publication Year 2011-07-29, Volume 6, Issue #7, Page e0022953
- Journal Title
- PLOS ONE
- Publication Type
- Journal Article
- Abstract
- We investigated differential gene expression between functionally specialized feeding polyps and swimming medusae in the siphonophore Nanomia bijuga (Cnidaria) with a hybrid long-read/short-read sequencing strategy. We assembled a set of partial gene reference sequences from long-read data (Roche 454), and generated short-read sequences from replicated tissue samples that were mapped to the references to quantify expression. We collected and compared expression data with three short-read expression workflows that differ in sample preparation, sequencing technology, and mapping tools. These workflows were Illumina mRNA-Seq, which generates sequence reads from random locations along each transcript, and two tag-based approaches, SOLiD SAGE and Helicos DGE, which generate reads from particular tag sites. Differences in expression results across workflows were mostly due to the differential impact of missing data in the partial reference sequences. When all 454-derived gene reference sequences were considered, Illumina mRNA-Seq detected more than twice as many differentially expressed (DE) reference sequences as the tag-based workflows. This discrepancy was largely due to missing tag sites in the partial reference that led to false negatives in the tag-based workflows. When only the subset of reference sequences that unambiguously have tag sites was considered, we found broad congruence across workflows, and they all identified a similar set of DE sequences. Our results are promising in several regards for gene expression studies in non-model organisms. First, we demonstrate that a hybrid long-read/short-read sequencing strategy is an effective way to collect gene expression data when an annotated genome sequence is not available. Second, our replicated sampling indicates that expression profiles are highly consistent across field-collected animals in this case. Third, the impacts of partial reference sequences on the ability to detect DE can be mitigated through workflow choice and deeper reference sequencing.
- Publisher
- PUBLIC LIBRARY OF SCIENCE
- Keywords
- RNA-SEQ DATA; SERIAL ANALYSIS; TRANSCRIPTOME; QUANTIFICATION; ANNOTATION; BIOLOGY; BIAS
- Research Division(s)
- Bioinformatics
- Publisher's Version
- https://doi.org/10.1371/journal.pone.0022953
- Open Access at Publisher's Site
- http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3146525/
- Terms of Use/Rights Notice
- Copyright Siebert et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Creation Date: 2011-07-29 12:00:00