Article ID: 5907734
Journal: Genomics
Published Year: 2014
Pages: 6 Pages
File Type: PDF
Abstract

• The statistical significance of a gene's expression is not associated with its predictive ability.
• Insignificant genes can improve prediction accuracy when combined with significant genes.
• The identified combined gene set showed higher performance than that of a previous study.
• The identified combined gene sets can be used for accurate diagnosis of oral cancer.
• The identified gene sets can be targets for biological pathway studies.

Research in genetics is shifting toward identifying differential coexpression among correlated genes rather than individually significant genes. Moreover, a combined biomarker pattern is known to improve the discrimination of a specific cancer. Identifying a combined biomarker is also necessary for the early detection of invasive oral squamous cell carcinoma (OSCC). To identify a combined biomarker that could improve the discrimination of OSCC, we searched for the number of genes in a combined gene set that yields the highest accuracy. After detecting a significant gene set containing the predefined number of genes, we computed a combined expression from the weights of the genes in the set. The weights were calculated using Principal Component Analysis (PCA). We used three public microarray datasets: one for identifying the combined biomarker and the other two for validation. Discrimination accuracy was measured by the out-of-bag (OOB) error. For individual genes, statistical significance was unrelated to discrimination accuracy. The identified gene set included both significant and insignificant genes. One of the most significant gene sets for classifying normal tissue versus OSCC included MMP1, SOCS3 and ACOX1. Furthermore, for discriminating oral dysplasia from OSCC, two combined biomarkers were identified. The combined expression performed well in the validation datasets and discriminated between conditions better than a single significant gene. Therefore, a combined biomarker could be expected to enable accurate diagnosis of cancer.
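The PCA weighting step can be illustrated with a minimal sketch. The abstract does not say how the weights are derived from PCA, so this example assumes that they are the loadings of the first principal component and that the combined expression is the weighted sum of the mean-centered gene expressions. The function name `combined_expression` and the data are hypothetical; the values are synthetic and are not taken from the study's microarray datasets.

```python
import numpy as np

def combined_expression(X):
    """Combine a gene set into one score per sample (illustrative only).

    X is a (samples x genes) expression matrix. The weights are assumed
    to be the first principal component loadings; the study's exact
    weighting scheme is not given in the abstract.
    """
    Xc = X - X.mean(axis=0)                 # center each gene
    # SVD of the centered data: rows of Vt are the principal axes
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    w = Vt[0]                               # first principal axis (unit length)
    # PCA signs are arbitrary; make the largest-magnitude loading positive
    if w[np.argmax(np.abs(w))] < 0:
        w = -w
    return w, Xc @ w                        # weights, combined score per sample

# Synthetic example: 20 "normal" and 20 "tumor" samples over 3 genes
rng = np.random.default_rng(0)
normal = rng.normal(0.0, 1.0, size=(20, 3))
tumor = rng.normal(0.0, 1.0, size=(20, 3)) + np.array([3.0, 2.0, -2.0])
X = np.vstack([normal, tumor])

weights, scores = combined_expression(X)
print("weights:", np.round(weights, 3))
print("mean score (normal):", round(float(scores[:20].mean()), 3))
print("mean score (tumor): ", round(float(scores[20:].mean()), 3))
```

In this sketch a gene with a small shift of its own still receives a nonzero weight, which is one way a combined score can draw on genes that are not individually significant.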
