Article ID | Journal ID | Publication year | English article | Full-text version |
---|---|---|---|---|
4956798 | 1364710 | 2016 | 9-page PDF | Free download |
English title of the ISI article
Leveraging MapReduce to efficiently extract associations between biomedical concepts from large text data
Related topics
Engineering and basic sciences
Computer engineering
Computer networks and communications
Preview of the article's first page
![First page of the article: Leveraging MapReduce to efficiently extract associations between biomedical concepts from large text data](/preview/png/4956798.png)
English abstract
Large biomedical text data represents an important source of information that not only enables researchers to discover in-depth knowledge about biological systems, but also helps healthcare professionals do evidence-based medicine in clinical settings. However, investigating and analyzing these data is often both data-intensive and computation-intensive. In this paper, we investigate how to use MapReduce, a parallel and distributed programming paradigm, to efficiently mine the associations between biomedical concepts extracted from a large set of biomedical articles. First, biomedical concepts were obtained by matching text to Unified Medical Language System (UMLS) Metathesaurus, a biomedical vocabulary and standard database. Then we developed a MapReduce algorithm that could be used to calculate a category of interestingness measures defined on the basis of a 2 × 2 contingency table. This algorithm consists of two MapReduce jobs and takes a stripes approach to reduce the number of intermediate results. Experiments were conducted using Amazon Elastic MapReduce (EMR) with an input of 33,960 articles from TREC (Text REtrieval Conference) 2006 Genomics Track. Performance test indicated that our algorithm had approximately linear scalability and was more efficient than a "pairs" approach in the literature. The physician in our project team evaluated a subset of the association mining results related to drug-disease treatment and found that meaningful association rules ranked high.
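The abstract does not spell out the algorithm, so the following is only a minimal single-process Python sketch of the general idea: stripes-style co-occurrence counting followed by a 2 × 2 contingency table per concept pair. It is not the authors' implementation. The function names, the `"*"` marginal key, the toy articles, and the choice of support, confidence, and lift as example measures are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical sentinel key: counts the articles that contain the concept itself.
MARGINAL = "*"


def map_stripes(concepts):
    """Mapper sketch: emit one stripe (co-occurrence map) per concept in an article."""
    unique = set(concepts)  # count each concept at most once per article
    for a in unique:
        stripe = Counter({b: 1 for b in unique if b != a})
        stripe[MARGINAL] = 1
        yield a, stripe


def reduce_stripes(emitted):
    """Reducer sketch: sum all stripes that share the same key, element by element."""
    totals = defaultdict(Counter)
    for concept, stripe in emitted:
        totals[concept].update(stripe)  # Counter.update adds counts
    return totals


def contingency_table(totals, a, b, n_articles):
    """Build (n11, n10, n01, n00) for concepts a and b over n_articles articles."""
    n_a = totals[a][MARGINAL]
    n_b = totals[b][MARGINAL]
    n_ab = totals[a][b]
    return n_ab, n_a - n_ab, n_b - n_ab, n_articles - n_a - n_b + n_ab


def measures(table):
    """Example interestingness measures for the rule a -> b (assumes non-zero margins)."""
    n11, n10, n01, n00 = table
    n = n11 + n10 + n01 + n00
    support = n11 / n
    confidence = n11 / (n11 + n10)
    lift = confidence / ((n11 + n01) / n)
    return {"support": support, "confidence": confidence, "lift": lift}


# Toy input: each article is the list of concepts found in it (invented data).
articles = [
    ["aspirin", "headache"],
    ["aspirin", "headache", "fever"],
    ["ibuprofen", "fever"],
    ["aspirin", "fever"],
]

emitted = [pair for article in articles for pair in map_stripes(article)]
totals = reduce_stripes(emitted)
table = contingency_table(totals, "aspirin", "headache", len(articles))
result = measures(table)
print(table, result)
```

In a stripes layout each mapper emits one map per concept rather than one record per concept pair, which is the usual reason this approach produces fewer intermediate records than a "pairs" layout. On a real cluster the two stages would run as separate MapReduce jobs instead of in-memory function calls.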
Publisher
Database: Elsevier - ScienceDirect
Journal: Microprocessors and Microsystems - Volume 46, Part B, October 2016, Pages 202-210
Authors
Yanqing Ji, Yun Tian, Fangyang Shen, John Tran