Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
11263686 | Computer Speech & Language | 2019 | 16 Pages | |
Abstract
Standard approaches to named entity recognition (NER) are based on sequential labeling methods, such as conditional random fields (CRFs), which label each word in a sentence and extract the word spans that correspond to named entities. With the extensive deployment of deep learning methods for sequential labeling tasks, state-of-the-art NER performance has been achieved with long short-term memory (LSTM) architectures using only basic features. In this paper, we address Korean NER tasks and propose an extension of the bidirectional LSTM-CRF by investigating character-based representations. Our extension deploys a hybrid representation that uses both a ConvNet and an LSTM for the sequential modeling of characters, namely a character-based LSTM-ConvNet hybrid representation. Using morphemes as the processing units of the bidirectional LSTM, we apply the proposed hybrid representation composed of morpheme vectors. Experimental results show that the proposed LSTM-ConvNet hybrid representation yields improvements over each single representation on standard Korean NER tasks.
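The idea of a character-based LSTM-ConvNet hybrid representation can be illustrated with a small sketch. The abstract gives no architectural details, so everything below is an assumption made for illustration: the class name `CharHybridEncoder`, all dimensions, the random untrained weights, the omission of bias terms, the use of a single forward LSTM, treating each Unicode syllable as one character, and combining the two views by simple concatenation. It is not the authors' implementation; it only shows how a ConvNet view and an LSTM view of the same character sequence could be joined into one vector per morpheme, which a bidirectional LSTM-CRF tagger would then consume.

```python
import math
import random

random.seed(0)  # fixed seed so the toy weights are reproducible


def rand_matrix(rows, cols):
    """Small random weight matrix (untrained, for illustration only)."""
    return [[random.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(m, v):
    """Matrix-vector product on plain Python lists."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class CharHybridEncoder:
    """Hypothetical character-level ConvNet + LSTM encoder for one morpheme."""

    def __init__(self, char_dim=8, conv_filters=6, window=3, lstm_dim=5):
        self.char_dim, self.window, self.lstm_dim = char_dim, window, lstm_dim
        self.char_emb = {}  # character -> embedding, created on first use
        # Each ConvNet filter spans `window` consecutive character embeddings.
        self.conv_w = rand_matrix(conv_filters, window * char_dim)
        # One weight matrix per LSTM gate, applied to [input; previous hidden].
        self.lstm_w = {g: rand_matrix(lstm_dim, char_dim + lstm_dim) for g in "ifoc"}

    def embed(self, ch):
        if ch not in self.char_emb:
            self.char_emb[ch] = [random.uniform(-0.1, 0.1) for _ in range(self.char_dim)]
        return self.char_emb[ch]

    def conv_repr(self, embs):
        """ConvNet view: slide filters over the characters, then max-pool over time."""
        pad = [[0.0] * self.char_dim] * (self.window - 1)
        padded = pad + embs + pad
        windows = [sum(padded[i:i + self.window], [])
                   for i in range(len(padded) - self.window + 1)]
        feats = [matvec(self.conv_w, w) for w in windows]
        return [max(col) for col in zip(*feats)]

    def lstm_repr(self, embs):
        """LSTM view: read the characters in order and keep the final hidden state."""
        h = [0.0] * self.lstm_dim
        c = [0.0] * self.lstm_dim
        for x in embs:
            z = x + h  # concatenate input and previous hidden state
            i = [sigmoid(v) for v in matvec(self.lstm_w["i"], z)]
            f = [sigmoid(v) for v in matvec(self.lstm_w["f"], z)]
            o = [sigmoid(v) for v in matvec(self.lstm_w["o"], z)]
            g = [math.tanh(v) for v in matvec(self.lstm_w["c"], z)]
            c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
            h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
        return h

    def encode(self, morpheme):
        """Hybrid vector: ConvNet features concatenated with LSTM features (assumed)."""
        embs = [self.embed(ch) for ch in morpheme]
        return self.conv_repr(embs) + self.lstm_repr(embs)


encoder = CharHybridEncoder()
vector = encoder.encode("서울")  # one morpheme, two syllable characters
print(len(vector))  # 6 ConvNet features + 5 LSTM features
```

In a full tagger, one such vector would be computed for every morpheme in the sentence, possibly alongside a pretrained morpheme embedding, and the resulting sequence would be the input to the sentence-level bidirectional LSTM whose outputs the CRF layer scores.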
Related Topics
Physical Sciences and Engineering
Computer Science
Signal Processing
Authors
Seung-Hoon Na, Hyun Kim, Jinwoo Min, Kangil Kim