Sense-level subjectivity in a multilingual setting

Article ID	Journal	Published Year	Pages	File Type
558284	Computer Speech & Language	2014	13 Pages	PDF

Abstract

• A manual annotation study shows that approximately 90% of the senses retain their subjectivity across languages.
• Subjectivity labeling discrepancy is caused by differences between languages in sense usage and lexicon granularity.
• Subjectivity information from multiple languages can be employed jointly to label new senses in a given language.
• Bootstrapping multilingual sense level subjectivity labeling outperforms monolingual learning.

Recent research on English word sense subjectivity has shown that the subjective aspect of an entity is better delineated at the sense level than at the traditional word level. In this paper, we explore whether senses aligned across languages exhibit this trait consistently and, if so, how this property can be leveraged automatically. We first conduct a manual annotation study to gauge whether the subjectivity trait of a sense can be robustly transferred across language boundaries. We then introduce an automatic framework that predicts subjectivity labels for unseen senses using either cross-lingual or multilingual training enhanced with bootstrapping. We show that the multilingual model consistently outperforms the cross-lingual one, with an accuracy of over 73% across all iterations.
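
To make the idea of multilingual training with bootstrapping concrete, the following is a minimal sketch and not the paper's actual system. It assumes each sense comes with a short gloss per language, merges the glosses into one language-prefixed feature bag, and uses a small Naive Bayes classifier as a stand-in for whatever learner the authors used. In each bootstrapping round, the most confidently predicted unlabeled senses are added to the training set. The function names (`featurize`, `train`, `predict`, `bootstrap`), the `subj`/`obj` labels, and the toy glosses are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter

LABELS = ("subj", "obj")  # hypothetical label names for subjective / objective


def featurize(glosses):
    """Merge per-language gloss tokens into one bag, prefixing each token
    with its language code so the vocabularies stay separate."""
    return Counter(
        f"{lang}:{tok}"
        for lang, text in glosses.items()
        for tok in text.lower().split()
    )


def train(labeled):
    """Collect per-class token counts for a multinomial Naive Bayes model."""
    counts = {label: Counter() for label in LABELS}
    docs = Counter()
    for feats, label in labeled:
        counts[label].update(feats)
        docs[label] += 1
    vocab = set().union(*counts.values())
    return counts, docs, vocab


def predict(model, feats):
    """Return (label, margin); margin is the log-score gap between classes."""
    counts, docs, vocab = model
    total_docs = sum(docs.values())
    scores = {}
    for label in LABELS:
        total = sum(counts[label].values())
        # Add-one smoothing on both the class prior and the token likelihoods.
        score = math.log((docs[label] + 1) / (total_docs + len(LABELS)))
        for tok, n in feats.items():
            score += n * math.log((counts[label][tok] + 1) / (total + len(vocab) + 1))
        scores[label] = score
    best = max(scores, key=scores.get)
    return best, abs(scores["subj"] - scores["obj"])


def bootstrap(seed, unlabeled, iterations=3, per_iter=1):
    """Self-training loop: retrain, then promote the most confident predictions."""
    labeled, pool = list(seed), list(unlabeled)
    for _ in range(iterations):
        if not pool:
            break
        model = train(labeled)
        ranked = sorted(pool, key=lambda f: predict(model, f)[1], reverse=True)
        confident, pool = ranked[:per_iter], ranked[per_iter:]
        labeled += [(f, predict(model, f)[0]) for f in confident]
    return train(labeled), labeled


# Toy data, invented for illustration (unaccented Romanian glosses).
seed = [
    (featurize({"en": "feeling of great joy", "ro": "sentiment de mare bucurie"}), "subj"),
    (featurize({"en": "a metal tool for cutting", "ro": "unealta de metal pentru taiat"}), "obj"),
]
unlabeled = [
    featurize({"en": "feeling of deep sorrow", "ro": "sentiment de tristete"}),
    featurize({"en": "a wooden tool for digging", "ro": "unealta de lemn pentru sapat"}),
]

model, labeled = bootstrap(seed, unlabeled)
for feats, label in labeled[len(seed):]:
    print(label, sorted(feats))
```

A cross-lingual variant of the same sketch would train on glosses from one language only and predict on another, whereas the multilingual variant shown here lets evidence from both languages contribute to every decision.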