Article ID: 403496
Journal: Knowledge-Based Systems
Published Year: 2015
Pages: 11
File Type: PDF
Abstract

• Selection of the best dictionary for Cross-Lingual Word Sense Disambiguation tasks.
• Potential improvements offered by automatically built dictionaries in ideal systems.
• Performance of different dictionaries on a particular unsupervised CLWSD system.
• Approach for outperforming other systems participating in CLWSD tasks.

The choice of the dictionary that supplies the candidate translations among which a system must choose is one of the most important steps in Cross-Lingual Word Sense Disambiguation (CLWSD). In this work, we compare different dictionaries in two frameworks. In the first, we develop a technique for analysing the potential results of an ideal system that uses those dictionaries. The second considers a particular unsupervised CLWSD system, CO-Graph, and analyses the results obtained when different bilingual dictionaries provide the candidate translations. Two CLWSD tasks, from the 2010 and 2013 SemEval competitions, are used for evaluation, and statistics on the words in the test datasets of those competitions are studied. The conclusions drawn from analysing the dictionaries on that particular system lead us to a proposal that substantially improves the results obtained in that framework. The proposal is a hybrid system that combines the results provided by a probabilistic dictionary with those obtained with a Most Frequent Sense (MFS) approach. The hybrid approach also outperforms the other unsupervised systems in the considered competitions.
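
The two ideas in the abstract can be illustrated with a minimal Python sketch. It is not the authors' method: the abstract does not say how the ideal-system analysis is computed or how the probabilistic dictionary and the MFS results are combined. The sketch assumes that (a) an ideal system is bounded by the share of gold translations the dictionary lists at all, (b) the MFS choice can be approximated by the most probable translation in the dictionary, and (c) the hybrid backs off to MFS when no context-supported candidate reaches a threshold. The names `oracle_recall`, `hybrid_choice`, `context_scores`, and `threshold` are hypothetical, as are the toy words and probabilities.

```python
from typing import Dict, Optional, Set

# Assumed representation: source word -> {translation: probability}.
ProbDict = Dict[str, Dict[str, float]]


def oracle_recall(dictionary: ProbDict, gold: Dict[str, Set[str]]) -> float:
    """Share of gold translations that the dictionary lists at all.

    A system can never output a translation its dictionary does not offer,
    so this value bounds the recall of any system built on that dictionary.
    """
    total = sum(len(translations) for translations in gold.values())
    if total == 0:
        return 0.0
    covered = sum(
        len(translations & set(dictionary.get(word, {})))
        for word, translations in gold.items()
    )
    return covered / total


def hybrid_choice(
    word: str,
    context_scores: Dict[str, float],
    dictionary: ProbDict,
    threshold: float = 0.5,
) -> Optional[str]:
    """Return a context-supported translation, backing off to MFS.

    Assumption: MFS is approximated by the most probable dictionary entry.
    """
    candidates = dictionary.get(word, {})
    if not candidates:
        return None  # the dictionary offers nothing for this word
    mfs = max(candidates, key=candidates.get)
    # Keep only context scores for translations the dictionary allows.
    scored = {t: s for t, s in context_scores.items() if t in candidates}
    if scored:
        best = max(scored, key=scored.get)
        if scored[best] >= threshold:
            return best
    return mfs


# Toy data, purely illustrative.
toy_dictionary: ProbDict = {"bank": {"banco": 0.7, "orilla": 0.3}}
toy_gold = {"bank": {"banco", "ribera"}}
```

With the toy data, `oracle_recall(toy_dictionary, toy_gold)` counts one covered gold translation out of two, and `hybrid_choice("bank", {"orilla": 0.8}, toy_dictionary)` keeps the context-supported candidate, while a weak context score falls back to the most probable entry.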

Related Topics
Physical Sciences and Engineering Computer Science Artificial Intelligence