Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
6866279 | Neurocomputing | 2014 | 7 Pages |
Abstract
In this paper, we propose a biology-constrained gene expression discretization method based on class distribution diversity. Inspired by the intrinsic relationship between gene expression and gene regulation, we constrain gene expression discretization to at most three discrete states and locate cut points using a regulatory states-guided mechanism. To take advantage of class label information, we define the class distribution diversity (CDD) of an interval and devise three supervised discretization rules. The proposed method is cost-efficient and simple to implement in practice. In the experiments, we evaluated the proposed method on four publicly available gene expression datasets covering four types of cancer (leukemia, prostate, lymphoma and liver cancer) and compared it with two previous methods, Fayyad and Irani's (FI) method and EBD. The experimental results show the effectiveness and efficiency of the proposed method.
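To make the three-state constraint concrete, the following is a minimal sketch of mapping one gene's expression values to at most three discrete states given two cut points. It is an illustration only, not the paper's method: the function name `discretize_three_state`, the state labels -1/0/1, and the hand-picked cut points are all assumptions. The abstract does not give the CDD definition or the three supervised rules that the authors use to choose cut points, so those are not implemented here.

```python
from bisect import bisect_right


def discretize_three_state(values, low_cut, high_cut):
    """Map each expression value to -1 (low), 0 (middle) or 1 (high).

    Hypothetical helper: the cut points are supplied by the caller,
    not derived from class labels as in the paper.
    """
    if low_cut > high_cut:
        raise ValueError("low_cut must not exceed high_cut")
    cuts = [low_cut, high_cut]
    # bisect_right yields 0, 1 or 2 for the three intervals; shift to -1, 0, 1.
    return [bisect_right(cuts, v) - 1 for v in values]


# Made-up expression values for a single gene across five samples.
expression = [0.2, 1.5, 3.1, 0.9, 2.4]
states = discretize_three_state(expression, low_cut=1.0, high_cut=2.0)
print(states)  # [-1, 0, 1, -1, 1]
```

Setting both cut points to the same value leaves the middle interval empty, which gives a two-state discretization and is still consistent with "at most three" states.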
Related Topics
Physical Sciences and Engineering
Computer Science
Artificial Intelligence
Authors
Hong-Qiang Wang, Gao-Jian Jing, Chunhou Zheng