Article code: 530013
Journal code: 869729
Publication year: 2015
English article: 11-page PDF
Title of the ISI article
Efficient segmentation-free keyword spotting in historical document collections
Related subjects
Engineering and Basic Sciences, Computer Engineering, Computer Vision and Pattern Recognition
Abstract


• We present a query-by-example keyword spotting method for historical collections.
• The method is segmentation-free and avoids any pre-processing step.
• We use a compact and efficient vectorial representation to index large collections.
• We outperform the recent state-of-the-art keyword spotting approaches.

In this paper we present an efficient segmentation-free word spotting method for historical document collections that follows the query-by-example paradigm. We use a patch-based framework in which local patches are described by a bag-of-visual-words model built on SIFT descriptors. By projecting the patch descriptors to a topic space with latent semantic analysis and compressing them with product quantization, we index the document information efficiently in terms of both memory and time. The proposed method is evaluated on four collections of historical documents and achieves good performance in both handwritten and typewritten scenarios, outperforming recent state-of-the-art keyword spotting approaches.
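
The indexing idea in the abstract (project bag-of-visual-words patch descriptors to a topic space, then compress them with product quantization) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the random histograms stand in for real SIFT-based descriptors, the helper names (`lsa_fit`, `kmeans`, `pq_train`, `pq_encode`, `pq_search`) and all sizes (64 visual words, 16 topics, 4 sub-vectors, 8 centroids) are assumptions chosen for illustration, and any weighting or normalization the paper applies before the projection is omitted.

```python
import numpy as np


def lsa_fit(bovw, n_topics):
    """Fit an LSA projection from a (patches x visual words) histogram matrix."""
    # Truncated SVD: keep the n_topics leading right singular vectors.
    _, _, vt = np.linalg.svd(bovw, full_matrices=False)
    return vt[:n_topics].T  # shape: (vocabulary size, n_topics)


def kmeans(x, k, n_iter, rng):
    """Plain Lloyd k-means; returns k centroids."""
    centroids = x[rng.choice(len(x), size=k, replace=False)].copy()
    for _ in range(n_iter):
        dists = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = x[labels == j]
            if len(members) > 0:  # leave empty clusters where they are
                centroids[j] = members.mean(axis=0)
    return centroids


def pq_train(x, n_sub, k, rng, n_iter=10):
    """Learn one codebook per sub-vector; result has shape (n_sub, k, d_sub)."""
    subs = np.split(x, n_sub, axis=1)
    return np.stack([kmeans(s, k, n_iter, rng) for s in subs])


def pq_encode(x, codebooks):
    """Replace each sub-vector by the index of its nearest centroid."""
    subs = np.split(x, codebooks.shape[0], axis=1)
    codes = [
        ((s[:, None, :] - cb[None, :, :]) ** 2).sum(axis=2).argmin(axis=1)
        for s, cb in zip(subs, codebooks)
    ]
    return np.stack(codes, axis=1).astype(np.uint8)  # valid while k <= 256


def pq_search(query, codes, codebooks):
    """Rank database items by asymmetric distance to an uncompressed query."""
    n_sub = codebooks.shape[0]
    q_subs = np.split(query, n_sub)
    # table[m, c] = squared distance from query sub-vector m to centroid c
    table = np.stack(
        [((cb - q) ** 2).sum(axis=1) for q, cb in zip(q_subs, codebooks)]
    )
    dists = table[np.arange(n_sub), codes].sum(axis=1)
    return np.argsort(dists)


# Synthetic stand-in for patch descriptors: 200 patches, 64 visual words.
rng = np.random.default_rng(0)
bovw = rng.random((200, 64))

projection = lsa_fit(bovw, n_topics=16)
topics = bovw @ projection                      # (200, 16) topic-space vectors
codebooks = pq_train(topics, n_sub=4, k=8, rng=rng)
codes = pq_encode(topics, codebooks)            # (200, 4) one byte per sub-vector
ranking = pq_search(topics[0], codes, codebooks)
```

With these illustrative sizes, each patch goes from 16 floating-point values (128 bytes as `float64`) to 4 one-byte codes, and a query is compared against the index through a small lookup table rather than against the full vectors, which is the general source of the memory and time savings the abstract refers to.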

Publisher
Database: Elsevier - ScienceDirect
Journal: Pattern Recognition - Volume 48, Issue 2, February 2015, Pages 545–555