Article code: 486706
Journal code: 703390
Publication year: 2012
English article: 9-page PDF
Full-text version: free download
English title of the ISI article
HASCH: High Performance Automatic Spell Checker for Portuguese Texts from the Web
Related topics
Engineering and Basic Sciences, Computer Engineering, Computer Science (General)
English abstract

The rise of Web 2.0 brought a real democratization of data generation. These data are mostly provided as text, ranging from reports on news portals, written in formal language, to comments in blogs and micro-blogging applications, which rely heavily on informal language. Addressing this heterogeneity is an essential preprocessing step before these data can be used by tools that aim to infer accurate information from them. This work therefore presents HASCH (High Performance Automatic Spell CHEcker), whose objective is to correct spelling in Portuguese texts collected from the Web. Because it is designed to handle a large volume of data, HASCH is completely parallelized in shared memory. In our evaluation, we found that HASCH was extremely effective in correcting very large texts from different Web sources, with an almost superlinear speedup.

Publisher
Database: Elsevier - ScienceDirect
Journal: Procedia Computer Science - Volume 9, 2012, Pages 403-411