AQA-WebCorp: Web-based Factual Questions for Arabic

Article ID	Journal	Published Year	Pages	File Type
4961827	Procedia Computer Science	2016	10 Pages	PDF

Abstract

Corpus construction has become an attractive resource for many natural language processing applications, such as question answering, machine translation, and information retrieval. Given the heterogeneity of available data and user demand for accurate information, many studies have emphasized the role of the Web as a source for corpus construction. Moreover, Arabic has far fewer linguistic corpora than languages such as English. In this paper, we focus on building our own corpus of Arabic questions and texts. We present a method for retrieving text passages that is based on actual automatic querying of Google, in order to collect passages of text that answer factual questions. The first part of this paper describes the method formally; the second part presents experiments and results that validate it.

Keywords

Question analysis Arabic passage corpus Google