Text Mining Gene Selection to Understand Pathological Phenotype Using Biological Big Data

Christophe Desterke
Hans-Kristian Lorenzo
Jean-Jacques Candelier


Whole-transcriptome omics experiments allow gene regulation to be studied at the cellular level. False discoveries can occur during the analysis and interpretation of omics data. To minimize them and identify truly significant results, multiple-testing correction has been built into bioinformatics algorithms. The scientific literature offers a vast body of information that can be parsed through a web Application Programming Interface (API). Gene selection by text mining can rank this information according to its importance while taking the most recent updates in the scientific literature into account. Integrating text mining selection into biological big data, such as transcriptome experiments including single-cell transcriptomes, can achieve a substantial dimensional reduction of the data without requiring any statistical hypothesis, which helps avoid false discoveries among the molecules of interest. Two examples are presented in this chapter as proof of concept: hydatidiform moles and focal segmental glomerulosclerosis (FSGS) nephropathy. The best expressed FSGS markers can be displayed through an interactive online web interface, built as a web resource based on the glomerular cell transcriptome. This chapter shows the value of integrating text mining with omics data analysis to discover specific molecules and to determine their locations and functions in complex diseases.
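
The selection step described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline. It assumes that the literature source is PubMed queried through the NCBI E-utilities `esearch` endpoint, and that a gene's importance is approximated by the number of abstracts mentioning both the gene and a disease keyword. The function names (`build_query`, `pubmed_count`, `rank_genes`, `filter_expression`), the query syntax, and the `min_count` threshold are all hypothetical choices made for illustration.

```python
import json
import urllib.parse
import urllib.request

# Assumption: PubMed via NCBI E-utilities is the literature source.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_query(gene, keyword):
    """Build a PubMed query requiring both terms in the title or abstract."""
    return f'"{gene}"[Title/Abstract] AND "{keyword}"[Title/Abstract]'


def pubmed_count(gene, keyword):
    """Return the number of PubMed records matching gene AND keyword.

    Performs a network request; it is not called when this module is imported.
    """
    params = urllib.parse.urlencode(
        {"db": "pubmed", "term": build_query(gene, keyword), "retmode": "json"}
    )
    with urllib.request.urlopen(f"{EUTILS_ESEARCH}?{params}", timeout=30) as resp:
        payload = json.load(resp)
    return int(payload["esearchresult"]["count"])


def rank_genes(counts, min_count=1):
    """Rank genes by citation count (descending), dropping genes below min_count.

    Ties are broken alphabetically so the ranking is deterministic.
    """
    kept = [(gene, n) for gene, n in counts.items() if n >= min_count]
    return sorted(kept, key=lambda item: (-item[1], item[0]))


def filter_expression(expression, ranked):
    """Keep only the rows of an expression table whose gene was selected.

    `expression` maps gene symbol -> list of expression values per sample.
    The result preserves the literature ranking order.
    """
    return {gene: expression[gene] for gene, _ in ranked if gene in expression}
```

With invented counts for placeholder genes, `rank_genes({"GENE_A": 40, "GENE_B": 55, "GENE_C": 0})` keeps `GENE_B` and `GENE_A` in that order and drops `GENE_C`. Passing the result to `filter_expression` then reduces the expression table to the literature-supported genes before any downstream analysis. In real use, the counts would come from calling `pubmed_count` once per gene, with pauses between requests to respect the service's rate limits.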



Article Details

Chapter 1