Article ID: 558284
Journal: Computer Speech & Language
Published Year: 2014
Pages: 13 Pages
File Type: PDF
Abstract

• A manual annotation study shows that approximately 90% of the senses retain their subjectivity across languages.
• Subjectivity labeling discrepancy is caused by differences between languages in sense usage and lexicon granularity.
• Subjectivity information from multiple languages can be employed jointly to label new senses in a given language.
• Bootstrapping multilingual sense level subjectivity labeling outperforms monolingual learning.

Recent research on English word sense subjectivity has shown that the subjective aspect of an entity is better delineated at the sense level than at the traditional word level. In this paper, we explore whether senses aligned across languages exhibit this trait consistently and, if so, how this property can be leveraged automatically. We first conduct a manual annotation study to gauge whether the subjectivity of a sense transfers robustly across language boundaries. We then introduce an automatic framework that predicts subjectivity labels for unseen senses using either cross-lingual or multilingual training enhanced with bootstrapping. We show that the multilingual model consistently outperforms the cross-lingual one, with an accuracy of over 73% across all iterations.
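
The bootstrapping idea mentioned in the abstract can be illustrated with a small self-training loop. This is only a sketch under stated assumptions, not the authors' framework: the abstract does not specify the classifier, the features, or the selection rule, so the smoothed log-odds scorer, the `a:`/`b:` language prefixes, the toy tokens, and the names `train`, `score`, and `bootstrap` are all hypothetical. Each sense is represented by tokens drawn from two aligned languages, and in each iteration the most confidently scored unlabeled senses are added to the training set.

```python
import math
from collections import Counter

LABELS = ("subjective", "objective")

def train(labeled):
    """Count token occurrences per label (hypothetical bag-of-tokens model)."""
    counts = {label: Counter() for label in LABELS}
    for tokens, label in labeled:
        counts[label].update(tokens)
    return counts

def score(counts, tokens):
    """Return (label, margin) from add-one-smoothed log-odds summed over tokens."""
    vocab_size = len(set(counts["subjective"]) | set(counts["objective"])) or 1
    totals = {label: sum(c.values()) for label, c in counts.items()}
    log_odds = 0.0
    for tok in tokens:
        p_subj = (counts["subjective"][tok] + 1) / (totals["subjective"] + vocab_size)
        p_obj = (counts["objective"][tok] + 1) / (totals["objective"] + vocab_size)
        log_odds += math.log(p_subj / p_obj)
    label = "subjective" if log_odds >= 0 else "objective"
    return label, abs(log_odds)

def bootstrap(seed, unlabeled, iterations=3, per_iter=2):
    """Self-training: repeatedly label the most confident senses and retrain."""
    labeled = list(seed)
    pool = list(unlabeled)
    newly_labeled = []
    for _ in range(iterations):
        if not pool:
            break
        counts = train(labeled)
        scored = [(score(counts, toks), toks) for toks in pool]
        # Highest margin first; the sort is stable, so ties keep pool order.
        scored.sort(key=lambda item: item[0][1], reverse=True)
        confident = scored[:per_iter]
        for (label, _margin), toks in confident:
            labeled.append((toks, label))
            newly_labeled.append((toks, label))
        chosen = [toks for _, toks in confident]
        pool = [toks for toks in pool if toks not in chosen]
    return newly_labeled

# Invented toy data: "a:" and "b:" mark tokens from two aligned languages.
SEED = [
    (["a:happy", "b:joy"], "subjective"),
    (["a:table", "b:furniture"], "objective"),
]
UNLABELED = [
    ["a:happy", "b:delight"],
    ["a:chair", "b:furniture"],
    ["a:glad", "b:delight"],
]

if __name__ == "__main__":
    for tokens, label in bootstrap(SEED, UNLABELED):
        print(label, tokens)
```

In this toy run the third sense shares no token with the seed set, so it can only be labeled after the first iteration has added `b:delight` as subjective evidence. That is the effect bootstrapping is meant to provide. A real system would also need a confidence threshold so that low-margin senses are not added.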

Related Topics
Physical Sciences and Engineering > Computer Science > Signal Processing