Article code: 487475
Journal code: 703573
Publication year: 2015
English article: 10 pages, PDF
Full-text version: free download
English title of the ISI article
Efficient and Accurate Discovery of Colossal Pattern Sequences from Biological Datasets: A Doubleton Pattern Mining Strategy (DPMine)
Related topics
Engineering and Basic Sciences, Computer Engineering, Computer Science (General)
English abstract

This paper presents DPMine, a new approach for discovering large colossal pattern sequences from biological datasets. DPMine first discovers doubleton patterns, which are then enriched into a DPT+ tree to generate colossal pattern sequences using a vector intersection operator. To discover doubleton patterns dynamically, DPMine uses a new integrated data structure called ‘D-struct’, which combines a doubleton data matrix with a one-dimensional array pair set. The DPT+ tree is constructed as a bitwise top-down column enumeration tree. A distinguishing feature of D-struct is that its main-memory usage is extremely limited and accurately predictable, so it runs very quickly under memory constraints. The algorithm needs only one scan over the database to discover large colossal pattern sequences. Empirical analysis shows that DPMine attains better mining efficiency on various biological datasets and outperforms Colossal Pattern Miner (CPM) and BVBUC in different settings. The performance of DPMine on biological datasets is also assessed with accuracy and F-measure.
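
The abstract does not specify the layout of D-struct or the DPT+ tree, so the following is only a minimal sketch of the general idea of finding doubleton (two-item) patterns and growing them with a bitwise vector intersection. It assumes each item is represented by an integer bit vector whose bit i is set when row i contains the item. The function names `item_bitvectors`, `doubleton_patterns`, and `extend_pattern`, and the greedy extension step, are hypothetical illustrations and not the paper's algorithm.

```python
from itertools import combinations


def item_bitvectors(transactions):
    """Build one bit vector per item in a single pass over the rows.

    Bit i of an item's vector is set when row i contains that item.
    """
    vectors = {}
    for row_index, transaction in enumerate(transactions):
        for item in set(transaction):
            vectors[item] = vectors.get(item, 0) | (1 << row_index)
    return vectors


def support(bitvector):
    """Number of rows covered by a bit vector (count of set bits)."""
    return bin(bitvector).count("1")


def doubleton_patterns(vectors, min_support):
    """Return every item pair whose intersected vector meets min_support."""
    doubletons = {}
    for a, b in combinations(sorted(vectors), 2):
        common_rows = vectors[a] & vectors[b]  # vector intersection
        if support(common_rows) >= min_support:
            doubletons[(a, b)] = common_rows
    return doubletons


def extend_pattern(pattern, rows, vectors, min_support):
    """Greedily grow a pattern (assumed strategy, not the DPT+ tree).

    An item is added whenever intersecting its vector with the current
    row set still leaves at least min_support rows.
    """
    items = list(pattern)
    for item in sorted(vectors):
        if item in items:
            continue
        narrowed = rows & vectors[item]
        if support(narrowed) >= min_support:
            items.append(item)
            rows = narrowed
    return tuple(sorted(items)), rows
```

With this representation the rows are read only once, when the bit vectors are built; every later step works on the vectors alone, which is one plausible reading of the single-scan property claimed in the abstract.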

Publisher
Database: Elsevier - ScienceDirect
Journal: Procedia Computer Science - Volume 54, 2015, Pages 412-421