Article ID: 6866279
Journal: Neurocomputing
Published Year: 2014
Pages: 7
File Type: PDF
Abstract
In this paper, we propose a biology-constrained gene expression discretization method based on class distribution diversity. Inspired by the intrinsic relationship between gene expression and gene regulation, we constrain the discretization to at most three discrete states and locate cut points using a mechanism guided by regulatory states. To take advantage of class label information, we define class distribution diversity (CDD) for an interval and devise three supervised discretization rules. The proposed method is cost-efficient and simple to implement. In the experiments, we evaluated the proposed method on four publicly available gene expression datasets covering four types of cancer (leukemia, prostate, lymphoma and liver cancer) and compared it with two previous methods: Fayyad and Irani's (FI) method and EBD. The experimental results show the effectiveness and efficiency of the proposed method.
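To make the three-state constraint concrete, the following is a minimal sketch of mapping continuous expression values onto at most three discrete states using two cut points. It is an illustration only, not the authors' algorithm: the abstract does not give the CDD definition or the three supervised rules, so the cut points here are simply passed in, and the function name `discretize_three_states` and the parameters `low_cut` and `high_cut` are hypothetical.

```python
from bisect import bisect_right


def discretize_three_states(values, low_cut, high_cut):
    """Map each expression value to a state in {0, 1, 2}.

    0: value <  low_cut
    1: low_cut <= value < high_cut
    2: value >= high_cut

    The cut points are assumed to be supplied by the caller; choosing them
    (for example with a supervised criterion) is outside this sketch.
    """
    if low_cut > high_cut:
        raise ValueError("low_cut must not exceed high_cut")
    cuts = [low_cut, high_cut]
    # bisect_right counts how many cut points are <= the value,
    # which is exactly the state index.
    return [bisect_right(cuts, v) for v in values]


# Hypothetical expression values for one gene across five samples.
states = discretize_three_states([0.2, 1.5, 3.1, 0.9, 2.0], low_cut=1.0, high_cut=2.0)
print(states)  # [0, 1, 2, 0, 2]
```

If both cut points coincide, the middle state is never produced and the gene is effectively discretized into two states, which is consistent with the "at most three" constraint.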
Related Topics
Physical Sciences and Engineering > Computer Science > Artificial Intelligence