Researchers in the biological sciences and genetics routinely face high-dimensional data, such as microarray data, and the analysis and proper interpretation of these data are central to bioinformatics and systems biology. In such data, the number of variables (for example, genes) is many times greater than the number of samples. The dimension of the data must therefore be reduced as a first step, and the analysis (for example, clustering) is then performed on the reduced data. This process is called data summarization. There are various ways to summarize high-dimensional data, and the appropriate choice depends on the nature of the data. The aim of data summarization is to remove unnecessary features so that the data can be classified more accurately. Shannon’s entropy is a common information measure used for clustering genes in microarray data and for selecting a set of disease-related genes. This chapter introduces and illustrates statistical inference concepts of entropy in microarray data clustering, with the goal of selecting the genes most strongly associated with a disease.
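As a rough illustration of the entropy idea mentioned above, the following minimal sketch computes a Shannon entropy score for each gene after discretizing its expression values, then ranks the genes by that score. It is not the chapter's procedure: the equal-width binning, the choice of three bins, the rule "higher entropy ranks first", the function names `shannon_entropy` and `rank_genes_by_entropy`, and the toy data are all assumptions made for illustration only.

```python
import math
from collections import Counter


def shannon_entropy(values, n_bins=3):
    """Shannon entropy (in bits) of one gene's expression values.

    Assumption: values are discretized with equal-width bins.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return 0.0  # a constant gene carries no information
    width = (hi - lo) / n_bins
    # Map each value to a bin index in [0, n_bins - 1]; the maximum goes in the last bin.
    bins = [min(int((v - lo) / width), n_bins - 1) for v in values]
    counts = Counter(bins)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def rank_genes_by_entropy(expression, n_bins=3):
    """Return (gene names sorted by decreasing entropy, per-gene entropy scores)."""
    scores = {gene: shannon_entropy(vals, n_bins) for gene, vals in expression.items()}
    ranking = sorted(scores, key=scores.get, reverse=True)
    return ranking, scores


# Hypothetical toy data: 3 genes measured in 6 samples (not real microarray values).
toy = {
    "gene_a": [1.0, 1.0, 1.0, 1.0, 1.0, 1.0],
    "gene_b": [0.0, 0.0, 0.0, 0.0, 0.0, 3.0],
    "gene_c": [0.0, 0.0, 1.5, 1.5, 3.0, 3.0],
}

ranking, scores = rank_genes_by_entropy(toy)
for gene in ranking:
    print(f"{gene}: {scores[gene]:.3f} bits")
```

In a real analysis the number of genes would be in the thousands, and the discretization scheme and selection rule would need to be chosen and justified for the data at hand. The sketch only shows how a per-gene entropy score can serve as a filter before clustering.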
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
Copyright of individual chapters belongs to the respective authors. The authors grant unrestricted publishing and distribution rights to the publisher. The electronic versions of the chapters are published under Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0). Users are allowed to share and adapt the chapters for any non-commercial purposes as long as the authors and the publisher are explicitly identified and properly acknowledged as the original source. The books in their entirety are subject to copyright by the publisher. The reproduction, modification, republication and display of the books in their entirety, in any form, by anyone, for commercial purposes are strictly prohibited without the written consent of the publisher.