Article ID: 562740
Journal: Biomedical Signal Processing and Control
Published Year: 2009
Pages: 7
File Type: PDF
Abstract

Data-mining methods can be used to generate rules, or identify patterns, from medical data to assist clinical diagnosis and decision-making. However, in the initial stages of a clinical study on a new diagnostic approach, only a limited medical dataset may be available, or the medical characteristics of the condition may mean that the number of patients involved in the study will never be large. Diagnoses made using rules discovered from such small medical databases should be considered suspect unless a confidence range for a particular diagnosis can be established. This paper presents a method for evaluating the sensitivity and reliability of data-mining with small databases. Efron's bootstrap method for statistical testing was used to assess the accuracy of the rules produced during the training step of the data-mining algorithm. The case study for validating this new approach was based on a small mammographic database previously used to discover associations between the diagnostic features of breast masses in mammograms and the biopsy-based classification of the masses. Using the new approach, it was possible to distinguish the association rules that were sensitive to the size of the training dataset from those that were not. The proposed methods should lead to an efficient way of validating the patterns discovered in medical data-mining applications using small datasets.
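
To illustrate the general idea of attaching a confidence range to a mined rule, the following is a minimal Python sketch of a percentile bootstrap over the confidence of a single association rule. It is not the procedure from the paper, whose details are not given in this abstract. The function names, the record fields (`margin`, `malignant`), the example rule and the toy records are all hypothetical and chosen only for illustration.

```python
import random


def rule_confidence(records, antecedent, consequent):
    """Fraction of records matching the antecedent that also match the consequent."""
    covered = [r for r in records if antecedent(r)]
    if not covered:
        return None  # the rule does not fire in this sample
    return sum(1 for r in covered if consequent(r)) / len(covered)


def bootstrap_confidence_interval(records, antecedent, consequent,
                                  n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the confidence of one rule.

    Each resample draws len(records) records with replacement and
    re-evaluates the rule on that resample.
    """
    rng = random.Random(seed)
    n = len(records)
    estimates = []
    for _ in range(n_resamples):
        sample = [records[rng.randrange(n)] for _ in range(n)]
        conf = rule_confidence(sample, antecedent, consequent)
        if conf is not None:
            estimates.append(conf)
    if not estimates:
        raise ValueError("rule never fired in any bootstrap resample")
    estimates.sort()
    lo_idx = int((alpha / 2) * len(estimates))
    hi_idx = int((1 - alpha / 2) * len(estimates)) - 1
    return estimates[lo_idx], estimates[hi_idx]


# Toy, made-up records (NOT data from the study).
records = (
    [{"margin": "spiculated", "malignant": True}] * 9
    + [{"margin": "spiculated", "malignant": False}] * 1
    + [{"margin": "circumscribed", "malignant": False}] * 8
    + [{"margin": "circumscribed", "malignant": True}] * 2
)


def is_spiculated(r):
    return r["margin"] == "spiculated"


def is_malignant(r):
    return r["malignant"]


point = rule_confidence(records, is_spiculated, is_malignant)
low, high = bootstrap_confidence_interval(records, is_spiculated, is_malignant)
print(f"rule confidence {point:.2f}, 95% bootstrap interval [{low:.2f}, {high:.2f}]")
```

A wide interval in a sketch like this would suggest that the rule's apparent accuracy depends heavily on which patients happened to be in the small training set, which is the kind of sensitivity the abstract describes.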

Related Topics
Physical Sciences and Engineering, Computer Science, Signal Processing