Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
4960935 | Procedia Computer Science | 2017 | 10 Pages |
A morph is a special type of fake alternative name. Internet users use morphs to achieve certain goals, such as expressing a particular sentiment or avoiding censorship. For example, Chinese internet users often replace “马景涛” (Ma Jingtao) with “咆哮教主” (Roar Bishop). Here “咆哮教主” (Roar Bishop) is a morph, and “马景涛” (Ma Jingtao) is its target entity. This paper focuses on morph resolution: given a morph, identify the entity that it actually refers to. After analyzing the common characteristics of morphs and target entities in cross-source corpora, we exploit temporal and semantic constraints to collect target candidates. We propose a framework based on character-word embeddings and radical-character-word embeddings to rank the target candidates. Our method does not need any human-annotated data. Experimental results demonstrate that our approach outperforms the state-of-the-art method. The results also show that performance is better when a morph shares at least one character with its target entity.
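The candidate-ranking step can be illustrated with a minimal sketch. It assumes that the morph and each target candidate are already represented as embedding vectors and that candidates are scored by cosine similarity to the morph. The abstract does not state the scoring function, so cosine similarity is an assumption here, and the vectors, the candidate names, and the helpers `cosine` and `rank_candidates` are hypothetical stand-ins rather than the paper's learned character-word or radical-character-word embeddings.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_candidates(morph_vec, candidate_vecs):
    """Return (candidate, score) pairs sorted by descending similarity to the morph."""
    scored = [(name, cosine(morph_vec, vec)) for name, vec in candidate_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy, made-up 3-dimensional vectors (assumption: real embeddings would be
# learned from corpora and have far more dimensions).
morph_vec = [0.9, 0.1, 0.3]
candidate_vecs = {
    "candidate_a": [0.8, 0.2, 0.4],
    "candidate_b": [0.1, 0.9, 0.2],
    "candidate_c": [0.0, 0.1, 0.9],
}

ranking = rank_candidates(morph_vec, candidate_vecs)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this sketch the candidate whose vector points in nearly the same direction as the morph vector is ranked first. In the setting the abstract describes, the candidate set would first be narrowed by the temporal and semantic constraints before any ranking takes place.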