Article ID Journal Published Year Pages File Type
533272 Pattern Recognition 2014 13 Pages PDF
Abstract

•We present a minimum-risk (MR) training method for semi-CRFs.•MR aims at minimizing the character error rate rather than the string error rate.•Three non-0/1 cost functions are compared with the conventional 0/1 cost.•Lattice edge selection is investigated in MR to reduce the training complexity.•Six learning criteria are evaluated on handwritten Chinese text recognition tasks.

Semi-Markov conditional random fields (semi-CRFs) are usually trained with maximum a posteriori (MAP) criterion which adopts the 0/1 cost for measuring the loss of misclassification. In this paper, based on our previous work on handwritten Chinese/Japanese text recognition (HCTR) using semi-CRFs, we propose an alternative parameter learning method by minimizing the risk on the training set, which has unequal misclassification costs depending on the hypothesis and the ground-truth. Based on this framework, three non-uniform cost functions are compared with the conventional 0/1 cost, and training data selection is incorporated to reduce the computational complexity. In experiments of online handwriting recognition on databases CASIA-OLHWDB and TUAT Kondate, we compared the performances of the proposed method with several widely used learning criteria, including conditional log-likelihood (CLL), softmax-margin (SMM), minimum classification error (MCE), large-margin MCE (LM-MCE) and max-margin (MM). On the test set (online handwritten texts) of ICDAR 2011 Chinese handwriting recognition competition, the proposed method outperforms the best system in competition.

Related Topics
Physical Sciences and Engineering Computer Science Computer Vision and Pattern Recognition
Authors
, , , , ,