Article code: 552532
Journal code: 1451085
Publication year: 2014
English article: 10-page PDF
Full-text version: free download
English title of the ISI article
Improving learning accuracy by using synthetic samples for small datasets with non-linear attribute dependency
Persian translation of the title
بهبود دقت یادگیری با استفاده از نمونه های مصنوعی برای مجموعه داده های کوچک با وابستگی به ویژگی های غیر خطی
Keywords
Small datasets; attribute dependency; related virtual samples; gene expression programming; mega-trend-diffusion
Related topics
Engineering and basic sciences, Computer engineering, Information systems
English abstract


• We construct a relational model between attributes that captures attribute dependency.
• We generate related virtual samples to improve small dataset learning.
• One practical dataset and three UCI datasets are used in the experiments.
• The results show that the proposed method has better prediction performance.

Small-data problems are commonly encountered in the early stages of a new manufacturing procedure and challenge both academics and practitioners, because learning models rarely perform well when data are insufficient. Virtual sample generation (VSG) has been shown to be an effective way to overcome this issue in a wide range of studies in various fields. Such works usually assume that the attributes are independent of each other, and produce synthetic data from the sample distribution of each attribute. However, VSG may be ineffective if the real data have interrelated attributes. Therefore, this research provides a novel procedure to generate related virtual samples with non-linear attribute dependency. To construct a relational model between the independent and dependent attributes, we employ gene expression programming (GEP) to find the most suitable mathematical model. One practical dataset and three real UCI datasets are used to verify the effectiveness of the proposed method, and the results show that, with a back-propagation neural network (BPN) as the learner, the proposed approach achieves better learning accuracy than the well-known mega-trend-diffusion (MTD) and multiple regression analysis (MRA) approaches.
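
The contrast the abstract draws between independent and related virtual samples can be illustrated with a minimal sketch. This is not the authors' procedure. It assumes a single independent attribute x and a single dependent attribute y, and a fixed quadratic form fitted by least squares stands in for the expression that GEP would search for. The observed attribute range is used as the sampling domain instead of an MTD-estimated one. All function names are hypothetical.

```python
import random
import statistics


def fit_quadratic_feature(xs, ys):
    """Least-squares fit of y = a * x**2 + b.

    Assumption: this fixed form replaces the GEP-discovered expression.
    """
    zs = [x * x for x in xs]
    z_mean = statistics.fmean(zs)
    y_mean = statistics.fmean(ys)
    sxy = sum((z - z_mean) * (y - y_mean) for z, y in zip(zs, ys))
    sxx = sum((z - z_mean) ** 2 for z in zs)
    a = sxy / sxx
    b = y_mean - a * z_mean
    return a, b


def independent_virtual_samples(xs, ys, n, rng):
    """Draw each attribute separately from its observed range (ignores dependency)."""
    return [
        (rng.uniform(min(xs), max(xs)), rng.uniform(min(ys), max(ys)))
        for _ in range(n)
    ]


def related_virtual_samples(xs, ys, n, rng):
    """Draw x from its observed range, then derive y from the fitted relation."""
    a, b = fit_quadratic_feature(xs, ys)
    residuals = [y - (a * x * x + b) for x, y in zip(xs, ys)]
    noise_sd = statistics.pstdev(residuals)  # spread of the fit residuals
    samples = []
    for _ in range(n):
        x = rng.uniform(min(xs), max(xs))
        y = a * x * x + b + rng.gauss(0.0, noise_sd)
        samples.append((x, y))
    return samples


def mean_abs_error(samples, true_fn):
    """Average distance of the virtual y values from the true relation."""
    return statistics.fmean(abs(y - true_fn(x)) for x, y in samples)


if __name__ == "__main__":
    rng = random.Random(0)
    xs = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]             # toy small dataset
    ys = [x * x + rng.gauss(0.0, 0.1) for x in xs]    # non-linear dependency
    indep = independent_virtual_samples(xs, ys, 200, rng)
    related = related_virtual_samples(xs, ys, 200, rng)
    print("independent:", mean_abs_error(indep, lambda x: x * x))
    print("related:    ", mean_abs_error(related, lambda x: x * x))
```

On this toy data the related samples stay close to the underlying curve, while the independently drawn ones scatter across the whole bounding box, which is the failure mode the abstract attributes to VSG under the independence assumption.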

Publisher
Database: Elsevier - ScienceDirect
Journal: Decision Support Systems - Volume 59, March 2014, Pages 286–295