Article ID | Journal ID | Publication year | English article | Full-text version |
---|---|---|---|---|
534868 | 870297 | 2011 | 5-page PDF | Free download |

Computed tomographic (CT) colonography is a promising alternative to traditional invasive colonoscopic methods for detecting and removing cancerous growths, or polyps, in the colon. Existing computer-aided diagnosis (CAD) algorithms for CT colonography typically employ a classifier to discriminate between true and false positives generated by a polyp candidate detection system, based on a set of features extracted from the candidates. However, these classifiers often suffer from a phenomenon termed the curse of dimensionality, whereby classifier performance degrades markedly as the number of features increases. A larger feature set also increases computational complexity and storage requirements. This paper investigates the benefits of feature selection on a polyp candidate database, with the aim of increasing specificity while preserving sensitivity. Two new mutual information methods for feature selection are proposed in order to select a subset of features for optimum performance. Initial results show that the performance of the widely used support vector machine (SVM) classifier is indeed better with a small set of features, with area under the receiver operating characteristic curve (AUC) measures reaching 0.78–0.88.
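As a rough illustration of the evaluation pipeline the abstract describes, the sketch below trains an SVM on a reduced subset of candidate features and scores it by AUC. It is not the authors' code; the data, the feature indices in `selected`, and the SVC settings are illustrative assumptions.

```python
# Minimal sketch, assuming synthetic candidate data: train an SVM on a
# reduced feature subset and report the area under the ROC curve (AUC).
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 40))      # 500 polyp candidates, 40 extracted features (hypothetical)
y = rng.integers(0, 2, size=500)    # 1 = true polyp, 0 = false positive
selected = [0, 3, 7, 12, 25]        # indices chosen by some feature-selection step (hypothetical)

X_tr, X_te, y_tr, y_te = train_test_split(
    X[:, selected], y, test_size=0.3, random_state=0)
clf = SVC(kernel="rbf", probability=True).fit(X_tr, y_tr)
auc = roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1])
print(f"AUC on the reduced feature set: {auc:.2f}")
```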
Research highlights
► We investigate the benefits of feature selection on a polyp candidate database via mutual information methods, with the aim of increasing specificity while preserving sensitivity.
► Typically, only the mutual information between features f and s is measured, without taking into account the objective of predicting the class C. We therefore introduce a new information measure, I(C;f∣s), to capture this. We also investigate the information provided by s and f for predicting C by using I(s;C∣f) and I(f;C∣s).
► Furthermore, as there is no direct method to estimate the regularization parameter β typically used in mutual information methods, and finding an optimum β may be computationally expensive, we propose a new mutual information algorithm that eliminates the β parameter (a toy sketch of this idea follows these highlights).
► Initial results show that the performance of the widely used support vector machine (SVM) classifier is indeed better with a small set of features, with area under the receiver operating characteristic curve (AUC) measures reaching 0.78–0.88.
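The β-free, conditional-information idea in these highlights can be pictured with a short sketch. The following is an illustrative interpretation, not the paper's algorithm: features are discretized into bins so that I(C; f ∣ s) can be estimated by counting, and each new feature is greedily chosen by the information it adds about the class C given the previously selected feature, with no β weighting.

```python
# Hedged sketch of greedy feature selection driven by conditional mutual
# information I(C; f | s); all data and helper names here are assumptions.
import numpy as np
from sklearn.metrics import mutual_info_score

def cond_mi(c, f, s):
    """Estimate I(C; f | s) for discrete (binned) variables by averaging
    I(C; f) over the values of s, weighted by p(s)."""
    total = 0.0
    for v in np.unique(s):
        mask = (s == v)
        total += mask.mean() * mutual_info_score(c[mask], f[mask])
    return total

def greedy_select(X_binned, y, k):
    """Pick k features: the first by I(C; f), each subsequent one by the
    information it adds given the previously chosen feature (no beta)."""
    remaining = list(range(X_binned.shape[1]))
    first = max(remaining, key=lambda j: mutual_info_score(y, X_binned[:, j]))
    chosen = [first]
    remaining.remove(first)
    while len(chosen) < k and remaining:
        last = X_binned[:, chosen[-1]]
        best = max(remaining, key=lambda j: cond_mi(y, X_binned[:, j], last))
        chosen.append(best)
        remaining.remove(best)
    return chosen

# Example: bin continuous features into 8 quantile-based levels, then select 5.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 40))
y = rng.integers(0, 2, size=500)
X_binned = np.array([np.digitize(col, np.quantile(col, np.linspace(0, 1, 9)[1:-1]))
                     for col in X.T]).T
print(greedy_select(X_binned, y, k=5))
```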
Journal: Pattern Recognition Letters - Volume 32, Issue 2, 15 January 2011, Pages 337–341