Discovering high quality answers in community question answering archives using a hierarchy of classifiers

Article ID	Journal	Published Year	Pages	File Type
393885	Information Sciences	2014	15 Pages	PDF

Abstract

In community-based question answering (CQA) services, where answers are written by humans, users may expect better answers than an automatic question answering system would provide. However, the user-generated answers in CQA archives are not always of high quality. Most existing work on answer quality prediction applies the same model to all answers, even though each answer is intrinsically different. Modeling each individual QA pair differently, however, is not feasible in practice. To balance efficiency and accuracy, we propose a hybrid hierarchy-of-classifiers framework to model the QA pairs. First, we analyze the question type to guide the selection of the right answer quality model. Second, we use the information from question analysis to predict the expected answer features and train type-based quality classifiers, which are combined hierarchically to produce an overall answer quality score. We also propose a number of novel features that are effective in distinguishing the quality of answers. We tested the framework on a dataset of about 50 thousand QA pairs from Yahoo! Answers. The results show that our proposed framework is effective in identifying high quality answers. Moreover, further analysis reveals that our framework classifies low quality answers more accurately than a single-classifier approach.
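The two-stage idea described above (first determine the question type, then apply a quality model chosen for that type) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the three question types, the keyword rules, the scoring heuristics, and all names such as `classify_question_type` and `QUALITY_MODELS` are hypothetical and are not taken from the paper. The paper's classifiers are trained rather than hand-written, and its hierarchical aggregation of scores is not reproduced here; the sketch shows only the routing step.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass
class QAPair:
    question: str
    answer: str


def classify_question_type(question: str) -> str:
    """Toy first-level classifier (hypothetical rules, not the paper's)."""
    q = question.lower().strip()
    if q.startswith(("how do", "how to", "how can")):
        return "procedural"
    if q.startswith(("who", "when", "where", "what is")):
        return "factoid"
    return "opinion"


def score_procedural(answer: str) -> float:
    # Assumed heuristic: step-like wording suggests a useful how-to answer.
    markers = ("first", "then", "next", "finally")
    hits = sum(marker in answer.lower() for marker in markers)
    return min(1.0, hits / 2)


def score_factoid(answer: str) -> float:
    # Assumed heuristic: factoid answers should be short and direct.
    n_words = len(answer.split())
    return 1.0 if 1 <= n_words <= 30 else 0.3


def score_opinion(answer: str) -> float:
    # Assumed heuristic: longer answers score higher, capped at 50 words.
    n_words = len(answer.split())
    return min(1.0, n_words / 50)


# Second level: one quality model per question type.
QUALITY_MODELS: Dict[str, Callable[[str], float]] = {
    "procedural": score_procedural,
    "factoid": score_factoid,
    "opinion": score_opinion,
}


def answer_quality(pair: QAPair) -> Tuple[str, float]:
    """Route the pair to the quality model selected by its question type."""
    qtype = classify_question_type(pair.question)
    return qtype, QUALITY_MODELS[qtype](pair.answer)


if __name__ == "__main__":
    pair = QAPair(
        "How do I reset my router?",
        "First unplug it, then wait ten seconds, finally plug it back in.",
    )
    print(answer_quality(pair))
```

In a realistic setting, each entry in `QUALITY_MODELS` would be a classifier trained on answers to questions of that type, so that features which matter for one type (for example, step structure in how-to answers) do not dilute the model for another.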

Keywords

Question answering system; User generated content