Article ID: 534844
Journal: Pattern Recognition Letters
Published Year: 2011
Pages: 7 Pages
File Type: PDF
Abstract

With the rapid evolution of the mobile environment, demand for information extraction on mobile devices is increasing. This paper proposes an information extraction system designed for mobile devices with limited hardware resources. The proposed system extracts temporal instances (dates and times) and named instances (locations and titles) from Korean short messages in an appointment management domain. Because temporal instances have a limited number of surface forms, the system extracts them efficiently with well-refined finite state automata. Named instances have far more varied surface forms, so the system extracts them with a modified hidden Markov model (HMM) based on character n-grams, which keeps hardware requirements low. In an experiment on instance boundary labeling, the proposed system showed performance comparable to that of representative conventional classifiers. The system was implemented on a commercial mobile phone to test its ability to automatically extract appointment information from a short message and store it in a schedule database. It performed well, with a reasonable response time.
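To make the finite-state idea concrete, the following is a minimal sketch of how a small automaton might pick out date and time fields from a Korean short message. It is not the automaton described in the paper: the function name `extract_temporal`, the two states, and the four unit characters (월, 일, 시, 분) are assumptions chosen for illustration, and markers such as 오전/오후 (AM/PM) are deliberately ignored.

```python
# Hypothetical sketch, not the paper's automaton: a two-state scanner that
# recognizes "<digits><unit>" patterns such as "3월" (month 3) or "30분" (minute 30).

# Assumed mapping from Korean unit characters to field names.
UNITS = {"월": "month", "일": "day", "시": "hour", "분": "minute"}
ASCII_DIGITS = "0123456789"


def extract_temporal(text):
    """Scan the text once and return (field, value) pairs in order of appearance."""
    results = []
    state = "START"  # START: waiting for a digit; NUM: currently reading digits
    digits = ""
    for ch in text:
        if ch in ASCII_DIGITS:
            # Any digit moves to (or stays in) the NUM state.
            state = "NUM"
            digits += ch
        elif state == "NUM" and ch in UNITS:
            # A unit character right after digits is an accepting transition.
            results.append((UNITS[ch], int(digits)))
            state, digits = "START", ""
        else:
            # Any other character resets the automaton.
            state, digits = "START", ""
    return results


if __name__ == "__main__":
    message = "3월 5일 오후 2시 30분에 강남역에서 회의"
    print(extract_temporal(message))
```

Because the scanner keeps only a state label and a short digit buffer, it runs in a single pass with constant memory, which is the kind of property that makes automata attractive on resource-limited devices. A unit character that is not preceded by digits (for example the 일 in 내일, "tomorrow") is not matched.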

Research highlights
► The proposed system extracts temporal instances (dates and times) and named instances (locations and titles) from Korean short messages in an appointment management domain.
► To effectively extract various surface forms of named instances with limited hardware resources, the proposed system uses a modified hidden Markov model (HMM) based on character n-grams.
► Experimental results indicate that the proposed method is suitable for information extraction applications on mobile devices with limited computing resources.

Related Topics
Physical Sciences and Engineering > Computer Science > Computer Vision and Pattern Recognition