کد مقاله کد نشریه سال انتشار مقاله انگلیسی نسخه تمام متن
488506 703898 2016 8 صفحه PDF دانلود رایگان
عنوان انگلیسی مقاله ISI
Hadoop Framework For Entity Resolution Within High Velocity Streams
موضوعات مرتبط
مهندسی و علوم پایه مهندسی کامپیوتر علوم کامپیوتر (عمومی)
پیش نمایش صفحه اول مقاله
Hadoop Framework For Entity Resolution Within High Velocity Streams
چکیده انگلیسی

Large amount of data is being generated from sensors, satellites, social media etc. This big data (velocity, variety, veracity, value and veracity) can be processed so as to make timely decisions by the decision makers. This paper presents results of the proposed Hadoop framework that performs entity resolution in Map and reduce phase. MapReduce phase matches two real world objects and generates rules. The similarity score of these rules are used for matching stream data during testing phase. Similarity is calculated using 13 different semantic measures such as token-based similarity, edit-based similarity, hybrid similarity, phonetic similarity as well as domain dependent Natural language processing measures. Semantic measures are implemented using Hive programming. The proposed system is tested using e-catalogues of Amazon and Google.

ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Procedia Computer Science - Volume 85, 2016, Pages 550–557
نویسندگان
, , ,