| Article ID | Journal | Published Year | Pages | File Type | 
|---|---|---|---|---|
| 377540 | Artificial Intelligence in Medicine | 2016 | 19 Pages | |

• We propose a method to support diagnosis in domains with time series data.
• Temporal abstraction is used to capture relevant domain concepts in data.
• A frequent pattern discovery technique is used to extract patient characteristics.
• Patterns are used to characterize population groups by comparison with a control group.
• Our method uses domain knowledge to improve the interpretability and acceptability of results.

Introduction
Numeric time series are present in a very wide range of domains, including many branches of medicine. Data mining techniques have proved to be useful for knowledge discovery in this type of data and for supporting decision-making processes.

Objectives
The overall objective is to classify time series based on the discovery of frequent patterns. These patterns will be discovered in symbolic sequences obtained from the time series data by means of a temporal abstraction process.

Methods
Firstly, we transform numeric time series into symbolic time sequences, where the symbols aim to represent the relevant domain concepts. These symbols can be defined using either public or expert domain knowledge. Then we apply a symbolic pattern discovery technique to the output symbolic sequences. This technique identifies the subsequences frequently found in a population group. These subsequences (patterns) are representative of population groups. Finally, we employ a classification technique based on the identified patterns in order to classify new individuals. Thanks to the inclusion of domain knowledge, the classification results can be explained using domain terminology. This makes the results easier to interpret for the domain specialist (physician).

Results
This method has been applied to brainstem auditory evoked potentials (BAEPs) time series. Preliminary experiments were carried out to analyse several aspects of the method including the best configuration of the pattern discovery technique parameters. We then applied the method to the BAEPs of 83 individuals belonging to four classes (healthy, conductive hearing loss, vestibular schwannoma with brainstem involvement, and vestibular schwannoma with 8th-nerve involvement). According to the results of the cross-validation, overall accuracy was 99.4%, sensitivity (recall) was 97.6% and specificity was 100% (no false positives).

Conclusion
The proposed method effectively reduces dimensionality. Additionally, if the symbolic transformation includes the right domain knowledge, the method arguably outputs a data representation that denotes the relevant domain concepts more clearly. The method is capable of finding patterns in BAEPs time series and is very accurate at correctly predicting whether or not new patients have an auditory-related disorder.
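
The three steps described under Methods (symbolic abstraction, frequent pattern discovery, pattern-based classification) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the abstraction rules, the pattern discovery algorithm or the classifier, so the threshold-based symbols (`L`, `N`, `H`), the contiguous fixed-length patterns, the support threshold, the voting rule, and all function names and toy values below are assumptions chosen only for illustration.

```python
from collections import Counter


def abstract_series(values, low, high):
    """Map each sample to a symbol: 'L' below low, 'H' above high, else 'N'.
    The two thresholds stand in for domain knowledge (assumed)."""
    return "".join("L" if v < low else "H" if v > high else "N" for v in values)


def frequent_patterns(sequences, length, min_support):
    """Contiguous subsequences of the given length that occur in at least
    min_support (a fraction) of the sequences."""
    counts = Counter()
    for seq in sequences:
        # A set ensures each pattern is counted at most once per sequence.
        counts.update({seq[i:i + length] for i in range(len(seq) - length + 1)})
    return {p for p, c in counts.items() if c / len(sequences) >= min_support}


def characteristic_patterns(group, control, length, min_support):
    """Patterns frequent in the group but not frequent in the control group."""
    return (frequent_patterns(group, length, min_support)
            - frequent_patterns(control, length, min_support))


def classify(seq, patterns_by_class, default):
    """Pick the class with the most characteristic patterns found in seq;
    fall back to the default class when nothing matches."""
    scores = {label: sum(p in seq for p in pats)
              for label, pats in patterns_by_class.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else default


# Toy data (invented values, not BAEP recordings).
LOW, HIGH = 2.0, 5.0
control = [abstract_series(s, LOW, HIGH)
           for s in ([3, 3, 4, 3, 3], [3, 4, 4, 3, 3], [4, 3, 3, 4, 3])]
affected = [abstract_series(s, LOW, HIGH)
            for s in ([3, 6, 6, 3, 3], [3, 3, 6, 6, 3], [6, 6, 3, 3, 3])]

patterns = characteristic_patterns(affected, control, length=2, min_support=0.6)
model = {"affected": patterns}

label_new = classify(abstract_series([3, 3, 6, 6, 3], LOW, HIGH), model, "healthy")
label_flat = classify(abstract_series([3, 3, 3, 3, 3], LOW, HIGH), model, "healthy")
print(sorted(patterns), label_new, label_flat)
```

In this toy run the patterns `HH`, `HN` and `NH` are frequent in the affected group but not in the control group, so a new sequence containing them is labelled accordingly, and the decision can be explained in terms of the symbols rather than raw sample values.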
