Plant Transcriptome Assembly: Review and Benchmarking


Sairam Behera
Adam Voshall
Etsuko Moriyama


Transcriptome assembly from next-generation sequencing data is an important step in a wide range of molecular-level biological studies. The quality of a computationally assembled transcriptome affects downstream analyses such as gene structure prediction, isoform identification, and gene expression analysis. However, the actual accuracy of an assembled transcriptome is usually unknown. Furthermore, assembly quality depends on several factors, including the method used, the parameters chosen for that method (for example, k-mer size), and the transcripts to be assembled. Users often choose an assembly method based solely on availability, without considering differences among methods or the effect of parameter choices. This is partly due to the lack of suitable benchmarking datasets. In this chapter, we review the computational approaches used for transcriptome assembly (genome-guided, de novo, and ensemble), the factors that affect assembly performance, including those particularly important for plant transcriptomes, and how assembly performance can be assessed. Using examples from plant transcriptomes, we further illustrate how simulated benchmark datasets can be generated and used to compare the quality of transcriptome assemblies, and how the performance of transcriptome assemblers can be assessed using various metrics.
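As a rough illustration of what assessing an assembly against a simulated benchmark can look like, the sketch below scores a set of assembled contigs against the reference transcripts used to simulate the reads. It assumes that a contig counts as correct only when its sequence is identical to a reference transcript, and it reports precision, recall, and F1. The function name `assembly_metrics`, the exact-match criterion, and the toy sequences are illustrative assumptions, not the chapter's stated protocol.

```python
def assembly_metrics(assembled, reference):
    """Return (precision, recall, f1) for an assembly versus a benchmark.

    Assumption (hypothetical criterion): a contig is a true positive only
    if it is identical to a reference transcript. Reverse complements and
    partial matches are deliberately not handled in this sketch.
    """
    # Compare unique, case-normalized sequences.
    assembled_set = {seq.upper() for seq in assembled}
    reference_set = {seq.upper() for seq in reference}

    true_pos = len(assembled_set & reference_set)

    # Precision: fraction of assembled contigs that are correct.
    precision = true_pos / len(assembled_set) if assembled_set else 0.0
    # Recall: fraction of reference transcripts that were recovered.
    recall = true_pos / len(reference_set) if reference_set else 0.0

    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy data, invented purely for illustration.
reference = ["ATGC", "GGCC", "TTAA", "CCGG"]
assembled = ["ATGC", "GGCC", "AAAA"]
precision, recall, f1 = assembly_metrics(assembled, reference)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

With real data the sequences would be read from FASTA files, and a looser criterion (for example, alignment-based identity above a threshold) could replace the exact-match test; the exact-match version is used here only because it keeps the sketch short and unambiguous.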



Chapter 7