Article ID: 4499833
Journal: Mathematical Biosciences
Published Year: 2016
Pages: 7
File Type: PDF
Abstract

• Identify gastric cancer-related genes by shortest path analysis of gene networks.
• Select the optimal GO terms and KEGG pathways that best describe the characteristics of gastric cancer-related genes.
• Predict novel gastric cancer-related genes, in addition to the previously known related genes, in expression analysis.
• Discuss biomarkers for gastric cancer detection that achieve 100% accuracy.
• Verify the gene expression of the predicted genes.
• Show that the newly predicted genes are consistent with previous literature.

Background/objective
Gastric cancer (GC) is the second leading cause of cancer death worldwide. The most common cause of GC is infection with Helicobacter pylori, while approximately 11% of cases are caused by genetic factors. The objective of this study was to develop an effective computational method to meaningfully interpret GC-related genes and to predict potential prognostic genes for clinical detection.

Methods
We employed the shortest path algorithm and a permutation test to identify genes related to known GC genes in a gene–gene interaction network. We calculated the enrichment scores of gene ontology terms and pathways for the gastric cancer-related genes to characterize them in terms of molecular features. The optimal features that best represent the gastric cancer-related genes were selected using Random Forest classification and incremental feature selection. Random Forest classification was also used to predict novel gastric cancer-related genes from the selected features and to identify novel prognostic genes from gene expression.

Results
Based on the shortest path analysis of 36 known GC genes, 39 genes lying on shortest paths were identified as GC-related genes. In the subsequent classification, 4153 gene ontology terms and 157 pathway terms were identified as the optimal features for describing these gastric cancer-related genes. Based on these features, a total of 886 genes were predicted as related genes. These 886 genes could serve as expression biomarkers for clinical detection, and they achieved 100% accuracy in distinguishing gastric cancer in a case-control dataset, better than 886 randomly selected genes did.

Conclusion
By analyzing the features of known GC-related genes, we employed a systematic method to predict gastric cancer-related genes and novel prognostic genes for accurate clinical detection.
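
The shortest path and permutation test step described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the network source, edge weights, or test statistic. The sketch therefore assumes an unweighted, undirected network searched by breadth-first search, and an empirical p-value defined as the fraction of random seed sets in which a gene appears on at least as many shortest paths as it does for the real seeds. The names TOY_NETWORK, G1, G2, G3, X, Y and Z are hypothetical placeholders, not real genes.

```python
from collections import deque
from itertools import combinations
import random

# Hypothetical toy network (undirected adjacency lists); not real gene data.
TOY_NETWORK = {
    "G1": ["X"],
    "G2": ["X"],
    "G3": ["Y", "Z"],
    "X": ["G1", "G2", "Y"],
    "Y": ["X", "G3"],
    "Z": ["G3"],
}


def shortest_path(graph, source, target):
    """Return one shortest path as a list of nodes (BFS), or None if unreachable."""
    if source == target:
        return [source]
    parents = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor in parents:
                continue
            parents[neighbor] = node
            if neighbor == target:
                # Walk back through the parent links to rebuild the path.
                path = [target]
                while parents[path[-1]] is not None:
                    path.append(parents[path[-1]])
                return path[::-1]
            queue.append(neighbor)
    return None


def path_gene_counts(graph, seeds):
    """Count how often each non-seed gene lies inside a seed-to-seed shortest path."""
    seed_set = set(seeds)
    counts = {}
    for a, b in combinations(sorted(seed_set), 2):
        path = shortest_path(graph, a, b)
        if path is None:
            continue
        for gene in path[1:-1]:  # interior nodes only
            if gene not in seed_set:
                counts[gene] = counts.get(gene, 0) + 1
    return counts


def permutation_p_values(graph, seeds, n_permutations=200, rng_seed=0):
    """Empirical p-value per candidate gene, using random seed sets of equal size."""
    rng = random.Random(rng_seed)
    observed = path_gene_counts(graph, seeds)
    nodes = sorted(graph)
    exceed = {gene: 0 for gene in observed}
    for _ in range(n_permutations):
        random_seeds = rng.sample(nodes, len(set(seeds)))
        random_counts = path_gene_counts(graph, random_seeds)
        for gene in observed:
            if random_counts.get(gene, 0) >= observed[gene]:
                exceed[gene] += 1
    return {gene: exceed[gene] / n_permutations for gene in observed}
```

Calling path_gene_counts(TOY_NETWORK, ["G1", "G2", "G3"]) reports the intermediate nodes X and Y as candidates, and permutation_p_values can then be used to filter out nodes that appear on shortest paths mainly because they are hubs. A weighted network would need Dijkstra's algorithm in place of breadth-first search.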
