|Article code||Journal code||Publication year||English article|
|483477||701411||2016||11-page PDF|
• A classification approach using fuzzy logic and an extreme learning machine is proposed.
• In data preprocessing, instances with outliers are eliminated from the dataset.
• Missing values are imputed by the most frequent value of the 5 nearest neighbors.
• A trapezoidal membership function is applied to transform the clinical dataset into linguistic variables.
• An Extreme Learning Machine (ELM) is used to train the single-layer feedforward neural network.
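The imputation and fuzzification steps in the highlights above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the Euclidean distance over shared columns, and the blood pressure breakpoints in `BP_TERMS` are all assumptions made for the example.

```python
import math
from collections import Counter


def impute_knn_mode(rows, target, col, k=5):
    """Return the most frequent value of `col` among the k rows nearest to rows[target]."""
    def dist(r, s):
        # Assumption: Euclidean distance over the other columns present in both rows.
        shared = [j for j in range(len(r))
                  if j != col and r[j] is not None and s[j] is not None]
        return math.sqrt(sum((r[j] - s[j]) ** 2 for j in shared))

    # Only rows that actually have a value in `col` can donate one.
    donors = [r for i, r in enumerate(rows) if i != target and r[col] is not None]
    nearest = sorted(donors, key=lambda r: dist(rows[target], r))[:k]
    return Counter(r[col] for r in nearest).most_common(1)[0][0]


def trapezoid(x, a, b, c, d):
    """Trapezoidal membership: 1 on [b, c], 0 outside (a, d), linear on the shoulders."""
    if b <= x <= c:
        return 1.0
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)   # rising edge
    return (d - x) / (d - c)       # falling edge


# Hypothetical linguistic terms for a blood pressure feature (illustrative breakpoints only).
BP_TERMS = {
    "low": (0, 0, 100, 120),
    "medium": (100, 120, 140, 160),
    "high": (140, 160, 300, 300),
}


def fuzzify(x, terms):
    """Map a crisp value to its membership degree in every linguistic term."""
    return {name: trapezoid(x, *params) for name, params in terms.items()}
```

For instance, `fuzzify(110, BP_TERMS)` assigns partial membership to both "low" and "medium", which is how a crisp clinical reading becomes a vector of linguistic degrees that a classifier can consume.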
Data mining techniques play a major role in developing computer-aided diagnosis systems and expert systems that aid physicians in clinical decision making. In this work, a classifier for clinical datasets that combines the relative merits of fuzzy sets and the extreme learning machine (FELM) is proposed. The FELM framework has three major subsystems: preprocessing, fuzzification and classification. The preprocessing subsystem handles missing value imputation and outlier elimination. The fuzzification subsystem maps each feature to a fuzzy set, and the classification subsystem uses an extreme learning machine for classification.

The Cleveland heart disease (CHD), Statlog heart disease (SHD) and Pima Indian diabetes (PID) datasets from the University of California Irvine (UCI) machine learning repository were used for experimentation. The CHD and SHD datasets were tested with two class labels, one indicating the absence and the other the presence of heart disease. The CHD dataset was also tested with five class labels: one indicating the absence of heart disease and the other four indicating its severity, namely low risk, medium risk, high risk and serious. The PID dataset was tested with two class labels, one indicating the absence and the other the presence of gestational diabetes.

The classifier achieved an accuracy of 93.55% on the CHD dataset with two class labels, 73.77% on the CHD dataset with five class labels, 94.44% on the SHD dataset and 92.54% on the PID dataset.
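To make the classification subsystem concrete, here is a minimal sketch of a generic extreme learning machine using NumPy. It is not the paper's code: the hidden layer size, the sigmoid activation, the random seed and the function names `elm_train` and `elm_predict` are assumptions chosen for illustration. The defining idea is that the hidden-layer weights are drawn at random and left fixed, and only the output weights are solved for in closed form with a pseudo-inverse.

```python
import numpy as np


def _hidden(X, W, b):
    # Sigmoid activation of the random hidden layer (activation choice is an assumption).
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))


def elm_train(X, y_onehot, n_hidden=20, seed=0):
    """Fit a single-hidden-layer network: random input weights, least-squares output weights."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))  # random, never updated
    b = rng.normal(size=n_hidden)                # random, never updated
    H = _hidden(X, W, b)
    beta = np.linalg.pinv(H) @ y_onehot          # Moore-Penrose solution of H @ beta = Y
    return W, b, beta


def elm_predict(X, W, b, beta):
    """Return the index of the highest-scoring class for each row of X."""
    return np.argmax(_hidden(X, W, b) @ beta, axis=1)


# Toy input in [0, 1], standing in for fuzzified membership degrees (made-up data).
X = np.array([[0.0, 0.1], [0.1, 0.0], [0.2, 0.1], [0.1, 0.2],
              [0.9, 1.0], [1.0, 0.9], [0.8, 0.9], [0.9, 0.8]])
labels = np.array([0, 0, 0, 0, 1, 1, 1, 1])
Y = np.eye(2)[labels]                            # one-hot targets

W, b, beta = elm_train(X, Y)
predictions = elm_predict(X, W, b, beta)
```

Because no iterative backpropagation is involved, training reduces to one matrix pseudo-inverse, which is the usual argument for the speed of ELM on small clinical datasets.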
Journal: Informatics in Medicine Unlocked - Volume 2, 2016, Pages 1–11