Article code: 515851
Journal code: 867114
Publication year: 2014
English article: 14 pages (PDF)
English title of the ISI article
Automatic thematic classification of election manifestos
Keywords
Text classification, political data, expert evaluation
Related topics
Engineering and Basic Sciences; Computer Engineering; Computer Science Software
English abstract


• We digitized political texts from the 1980s and 1990s.
• We used these data to learn a classifier that can label more recent political texts.
• Change of themes over the years affects recall of the learned classifier.
• But precision is comparable to the precision obtained by a human expert labeller.
• For political themes, a high level of detail seems to be preferred by domain experts.

We digitized three years of Dutch election manifestos annotated by the Dutch political scientist Isaac Lipschits. We used these data to train a classifier that can automatically label new, unseen election manifestos with themes. Having the manifestos in a uniform XML format with all paragraphs annotated with their themes has advantages for both electronic publishing of the data and diachronic comparative data analysis. The data that we created will be disclosed to the public through a search interface. This means that it will be possible to query the data and filter them on themes and parties. We optimized the Lipschits classifier on the task of classifying election manifestos using models trained on earlier years. We built a classifier that is suited for classifying election manifestos from 2002 onwards using the data from the 1980s and 1990s. We evaluated the results by having a domain expert manually assess a sample of the classified data. We found that our automatic classifier obtains the same precision as a human classifier on unseen data. Its recall could be improved by extending the set of themes with newly emerged themes. Thus when using old political texts to classify new texts, work is needed to link and expand the set of themes to newer topics.
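
The relationship between newly emerged themes and recall can be made concrete with a small worked example. The sketch below is not the evaluation procedure used in the article. It is a toy illustration with invented theme names and invented labels, in which a classifier that only knows the older theme set is scored on paragraphs that partly carry a newer theme (here the hypothetical theme "internet"). Precision stays high because every assigned label is correct, while recall drops because the newer theme can never be assigned.

```python
def micro_precision_recall(gold, predicted):
    """Micro-averaged precision and recall for multi-label paragraphs.

    gold, predicted: lists of sets of theme labels, one set per paragraph.
    """
    true_positives = sum(len(g & p) for g, p in zip(gold, predicted))
    n_predicted = sum(len(p) for p in predicted)
    n_gold = sum(len(g) for g in gold)
    precision = true_positives / n_predicted if n_predicted else 0.0
    recall = true_positives / n_gold if n_gold else 0.0
    return precision, recall


# Hypothetical theme set available in the older training data.
known_themes = {"defence", "education", "housing"}

# Invented gold labels for four newer paragraphs; "internet" is a
# stand-in for a theme that did not exist in the training years.
gold = [
    {"defence"},
    {"education", "internet"},
    {"housing"},
    {"internet"},
]

# Invented classifier output: it can only assign themes it was trained on.
predicted = [
    {"defence"},
    {"education"},
    set(["housing"]),
    set(),
]

precision, recall = micro_precision_recall(gold, predicted)

# Recall measured only against themes the classifier could have known.
gold_known_only = [g & known_themes for g in gold]
_, recall_known = micro_precision_recall(gold_known_only, predicted)

print(f"precision={precision:.2f} recall={recall:.2f} "
      f"recall_on_known_themes={recall_known:.2f}")
```

In this toy setting the gap between `recall` and `recall_known` is entirely due to the theme missing from the older label set, which is the kind of loss that extending the theme set is meant to address.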

Publisher
Database: Elsevier - ScienceDirect
Journal: Information Processing & Management - Volume 50, Issue 4, July 2014, Pages 554–567