Article ID: 469564
Journal: Computer Methods and Programs in Biomedicine
Published Year: 2009
Pages: 9
File Type: PDF
Abstract

Data mining, through its capacity to discover knowledge embedded in large databases and thereby improve organizational decision-making, has the potential to contribute to efficiencies and cost savings in the increasingly costly healthcare industry. One important aspect of mining medical databases is reducing dimensionality through feature selection. Traditionally, feature selection is accomplished through stepwise regression, which tends to produce an unnecessarily high number of “significant” variables. This paper applies a filter-based feature selection method, using the inconsistency rate measure and discretization, to a medical claims database to predict the adequacy of duration of antidepressant medication utilization. Traditional stepwise logistic regression selected seven of nine potential explanatory variables to characterize patients with inadequate antidepressant medication utilization, whereas the filter-based method selected two variables (age and number of claims) and achieved similar prediction accuracy. This comparison suggests it may be feasible and efficient to apply the filter-based feature selection method to reduce the dimensionality of healthcare databases.
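As a rough illustration of the idea behind an inconsistency-rate filter, the sketch below discretizes numeric variables, groups records by their pattern of feature values, and counts the records that disagree with the majority class of their pattern. It is a minimal sketch under stated assumptions, not the paper's implementation: the function names, the cut points, the exhaustive subset search, and the six toy records are all hypothetical and were made up purely for illustration; they are not drawn from the claims database.

```python
from collections import Counter, defaultdict
from itertools import combinations


def discretize(value, cut_points):
    """Return the bin index of a numeric value (count of cut points <= value)."""
    return sum(value >= c for c in cut_points)


def inconsistency_rate(rows, labels, features):
    """Share of rows whose label differs from the majority label of their pattern."""
    groups = defaultdict(Counter)
    for row, label in zip(rows, labels):
        pattern = tuple(row[f] for f in features)
        groups[pattern][label] += 1
    # Within each pattern, every row outside the majority class is inconsistent.
    inconsistent = sum(sum(c.values()) - max(c.values()) for c in groups.values())
    return inconsistent / len(rows)


def smallest_consistent_subset(rows, labels, all_features, tolerance=0.0):
    """Exhaustively search for the smallest subset that is as consistent as the full set."""
    baseline = inconsistency_rate(rows, labels, all_features)
    for size in range(1, len(all_features) + 1):
        for subset in combinations(all_features, size):
            if inconsistency_rate(rows, labels, subset) <= baseline + tolerance:
                return subset
    return tuple(all_features)


# Hypothetical toy records: (age in years, number of claims, sex, outcome).
raw = [
    (25, 2, "F", "inadequate"),
    (30, 8, "M", "adequate"),
    (55, 3, "F", "adequate"),
    (60, 9, "M", "adequate"),
    (28, 1, "M", "inadequate"),
    (33, 7, "F", "adequate"),
]

# Assumed cut points: age split at 40, number of claims split at 5.
rows = [
    {"age": discretize(age, [40]), "claims": discretize(claims, [5]), "sex": sex}
    for age, claims, sex, _ in raw
]
labels = [outcome for *_, outcome in raw]
features = ("age", "claims", "sex")

selected = smallest_consistent_subset(rows, labels, features)
print("selected features:", selected)
for f in features:
    print(f, round(inconsistency_rate(rows, labels, [f]), 3))
```

The exhaustive search is practical only for a handful of candidate variables, such as the nine mentioned above; with more variables a heuristic or probabilistic search would be the more realistic choice. The `tolerance` parameter is an assumed knob for accepting a slightly higher inconsistency rate in exchange for fewer features.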

Related Topics
Physical Sciences and Engineering > Computer Science > Computer Science (General)