Free article download: Combining sentence similarities measures to identify paraphrases

Article ID	Journal ID	Publication year	English article	Full-text version
4973639	1451680	2018	15-page PDF	Free download

English title of the ISI article

Combining sentence similarities measures to identify paraphrases

Persian translation of the title

A combination of similarity concepts for identifying paraphrases


Keywords

Sentence similarity; Paraphrase identification; Sentence simplification; Graph-based model

Graph-based model

Related topics

Engineering and Basic Sciences, Computer Engineering, Signal Processing


English abstract

- It proposes a new paraphrase identification system based on lexical, syntactic, and semantic analysis.
- It uses different machine learning algorithms to classify sentence pairs as paraphrases or not.
- The measure was evaluated on the standard dataset for the task, the Microsoft Paraphrase Corpus.

Paraphrase identification is the task of determining whether two sentences are semantically equivalent. It is applied in many natural language tasks, such as text summarization, information retrieval, text categorization, and machine translation. In general, methods for paraphrase identification perform three steps. First, they represent sentences as vectors using a bag of words or syntactic information about the words present in the sentence. Next, this representation is used to measure different similarities between the two sentences. In the third step, these similarities are given as input to a machine learning algorithm that classifies the two sentences as a paraphrase or not. However, two important problems in paraphrase identification are not handled by such methods: (i) the meaning problem: two sentences may share the same meaning while being composed of different words; and (ii) the word order problem: the order of the words in a sentence may change the meaning of the text. This paper proposes a paraphrase identification system that represents each pair of sentences as a combination of different similarity measures. These measures extract lexical, syntactic, and semantic components of the sentences, which are encompassed in a graph. The proposed method was benchmarked on the Microsoft Paraphrase Corpus, which is the publicly available standard dataset for the task. Different machine learning algorithms were applied to classify a sentence pair as a paraphrase or not. The results show that the proposed method outperforms state-of-the-art systems.
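The generic three-step pipeline described in the abstract, and the word order problem in particular, can be illustrated with a minimal sketch. This is not the authors' graph-based method: the function names (`tokenize`, `cosine_bow`, `jaccard`, `word_order_similarity`, `similarity_features`) and the choice of measures are assumptions made purely for illustration, and the third step (training a classifier on the resulting features) is left out.

```python
import math
from collections import Counter
from itertools import combinations


def tokenize(sentence):
    # Hypothetical tokenizer: lowercase and split on whitespace.
    return sentence.lower().split()


def cosine_bow(a, b):
    # Step 1 + 2: bag-of-words count vectors compared by cosine similarity.
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[w] * cb[w] for w in ca)
    norm = math.sqrt(sum(v * v for v in ca.values()) *
                     sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0


def jaccard(a, b):
    # Overlap of the two vocabularies, ignoring counts and order.
    sa, sb = set(a), set(b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def word_order_similarity(a, b):
    # Illustrative measure (an assumption, not from the paper): the fraction
    # of shared-word pairs whose first occurrences keep the same relative
    # order in both sentences.
    vocab_b = set(b)
    shared = [w for w in dict.fromkeys(a) if w in vocab_b]
    pairs = list(combinations(shared, 2))
    if not pairs:
        return 0.0
    first_pos_b = {}
    for i, w in enumerate(b):
        first_pos_b.setdefault(w, i)
    # Each pair (x, y) is already ordered as in sentence a.
    agree = sum(1 for x, y in pairs if first_pos_b[x] < first_pos_b[y])
    return agree / len(pairs)


def similarity_features(s1, s2):
    # Feature vector that a classifier (step 3, not shown) could consume.
    a, b = tokenize(s1), tokenize(s2)
    return [cosine_bow(a, b), jaccard(a, b), word_order_similarity(a, b)]


if __name__ == "__main__":
    print(similarity_features("the dog bit the man", "the man bit the dog"))
```

For the pair "the dog bit the man" and "the man bit the dog", both bag-of-words measures report a perfect match even though the meanings differ, while the order-sensitive measure drops to 0.5. This is the kind of gap that motivates combining several similarity measures rather than relying on a single one.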

Publisher

Database: Elsevier - ScienceDirect
Journal: Computer Speech & Language - Volume 47, January 2018, Pages 59-73

Authors

Rafael Ferreira, George D.C. Cavalcanti, Fred Freitas, Rafael Dueire Lins, Steven J. Simske, Marcelo Riss
