Free article download: Irish: A Hidden Markov Model to detect coded information islands in free text

Article code	Journal code	Publication year	English article	Full text
433225	1441648	2015	18-page PDF	Free download

English title of the ISI article

Irish: A Hidden Markov Model to detect coded information islands in free text

Keywords

Hidden Markov model, unstructured data mining, developers' communication

Hidden Markov models

Related topics

Engineering and Basic Sciences, Computer Engineering, Computational Theory and Mathematics

Abstract

• We propose a novel approach named IRISH (InfoRmation ISlands Hmm) to identify coded information in unstructured data sources.
• IRISH is based on Hidden Markov Models.
• We compare IRISH with the use of island parsers.
• IRISH achieves performance close to that of island parsers, but it is easier for non-experts to apply.

Developers' communication, as contained in emails, issue trackers, and forums, is a precious source of information to support the development process. For example, it can be used to capture knowledge about development practice or about a software project itself. Thus, extracting the content of developers' communication can be useful to support several software engineering tasks, such as program comprehension, source code analysis, and software analytics. However, automating the extraction process is challenging, due to the unstructured nature of free text, which mixes different coding languages (e.g., source code, stack dumps, and log traces) with natural language parts.

We conduct an extensive evaluation of Irish (InfoRmation ISlands Hmm), an approach we proposed to extract islands of coded information from free text at token granularity, with respect to state-of-the-art approaches based on island parsing or on island parsing combined with machine learners. The evaluation considers a wide set of natural language documents (e.g., textbooks, forum discussions, and development emails) taken from different contexts and encompassing different coding languages. Results indicate an F-measure for Irish between 74% and 99%; this is in line with existing approaches which, unlike Irish, require specific expertise for the definition of regular expressions or grammars.
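To make the idea of token-level labelling with a Hidden Markov Model more concrete, the following is a minimal sketch, not the authors' implementation. It assumes two hidden states (`text` and `code`), a coarse three-way token classification, and hand-picked probabilities; all names (`token_class`, `viterbi`) and all numeric values are hypothetical and chosen only for illustration. It uses standard Viterbi decoding to find the most likely state sequence.

```python
import math

# Hidden states: natural-language text vs. an island of coded information.
STATES = ("text", "code")

# Hypothetical, hand-picked parameters (illustrative only, not from the paper).
START = {"text": 0.8, "code": 0.2}
TRANS = {
    "text": {"text": 0.9, "code": 0.1},
    "code": {"text": 0.2, "code": 0.8},
}
EMIT = {
    "text": {"word": 0.85, "symbol": 0.10, "ident": 0.05},
    "code": {"word": 0.20, "symbol": 0.45, "ident": 0.35},
}


def token_class(token):
    """Map a token to a coarse observation class (an assumption of this sketch)."""
    if token.isalpha():
        return "word"      # purely alphabetic, e.g. "thanks"
    if any(ch.isalnum() for ch in token):
        return "ident"     # letters/digits mixed with punctuation, e.g. "i<n;"
    return "symbol"        # punctuation only, e.g. "{"


def viterbi(tokens):
    """Return the most likely state for each token, computed in log space."""
    if not tokens:
        return []
    obs = [token_class(t) for t in tokens]
    score = {s: math.log(START[s]) + math.log(EMIT[s][obs[0]]) for s in STATES}
    back = []  # one dict of back-pointers per transition
    for o in obs[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            prev = max(STATES, key=lambda p: score[p] + math.log(TRANS[p][s]))
            new_score[s] = (score[prev] + math.log(TRANS[prev][s])
                            + math.log(EMIT[s][o]))
            pointers[s] = prev
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


if __name__ == "__main__":
    tokens = "see the loop for(i=0; i<n; i++) { } thanks all".split()
    for token, label in zip(tokens, viterbi(tokens)):
        print(f"{label}\t{token}")
```

In a realistic setting the transition and emission probabilities would be estimated from labelled examples rather than set by hand, which is what removes the need to write regular expressions or grammars.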

Publisher

Database: Elsevier - ScienceDirect
Journal: Science of Computer Programming - Volume 105, 1 July 2015, Pages 26–43

Authors

Luigi Cerulo, Massimiliano Di Penta, Alberto Bacchelli, Michele Ceccarelli, Gerardo Canfora
