Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
558272 | 874889 | 2014 | 21-page PDF | Free download |
• Acoustic feature similarity between spoken segments can improve spoken term detection.
• Pseudo-relevance feedback and graph-based re-ranking approaches using acoustic feature similarity are proposed.
• A generalized framework for these approaches is presented.
• Significant improvements with both in-vocabulary and out-of-vocabulary queries were observed.
Spoken content retrieval will be very important for retrieving and browsing multimedia content over the Internet, and spoken term detection (STD) is one of its key technologies. In this paper, we show that acoustic feature similarity between spoken segments, used with pseudo-relevance feedback and graph-based re-ranking, can improve the performance of STD. Pseudo-relevance feedback rests on the idea that spoken segments whose acoustic feature vector sequences resemble those of segments with higher (lower) relevance scores should themselves receive higher (lower) scores, while graph-based re-ranking further uses a graph to consider the similarity structure among all the segments retrieved in the first pass. These approaches are formulated on both word and subword lattices, and a complete framework for using them in open-vocabulary retrieval of spoken content is presented. Significant improvements with both in-vocabulary and out-of-vocabulary queries were observed in preliminary experiments.
Journal: Computer Speech & Language - Volume 28, Issue 5, September 2014, Pages 1045–1065
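
The graph-based re-ranking idea described in the abstract can be illustrated with a minimal sketch. It is not the paper's formulation: the function names, the `exp(-distance)` similarity kernel, the length-normalized dynamic time warping (DTW) distance, and the interpolation weight `alpha` are all assumptions chosen for illustration. Each retrieved segment is a node, edges are weighted by acoustic similarity, and a random walk with restart lets first-pass scores flow between acoustically similar segments.

```python
import math


def dtw_distance(a, b):
    """Length-normalized DTW distance between two sequences of feature vectors."""
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean frame distance
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m] / (n + m)


def similarity_matrix(segments):
    """Pairwise similarity exp(-DTW distance); the kernel is an assumption."""
    k = len(segments)
    sim = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1, k):
            s = math.exp(-dtw_distance(segments[i], segments[j]))
            sim[i][j] = sim[j][i] = s
    return sim


def graph_rerank(first_pass, sim, alpha=0.5, iters=50):
    """Random walk with restart over the similarity graph.

    Iterates score[j] = (1 - alpha) * prior[j] + alpha * sum_i score[i] * P[i][j],
    where prior is the normalized first-pass score and P is the
    row-normalized similarity matrix.
    """
    k = len(first_pass)
    total = sum(first_pass)
    prior = [s / total for s in first_pass]
    trans = []
    for row in sim:
        row_sum = sum(row)
        trans.append([v / row_sum if row_sum > 0 else 0.0 for v in row])
    scores = prior[:]
    for _ in range(iters):
        scores = [
            (1 - alpha) * prior[j]
            + alpha * sum(scores[i] * trans[i][j] for i in range(k))
            for j in range(k)
        ]
    return scores


# Toy data: segments 0 and 1 are acoustically identical, segment 2 differs.
segments = [
    [[0.0], [1.0], [2.0]],
    [[0.0], [1.0], [2.0]],
    [[5.0], [6.0], [7.0]],
]
first_pass = [0.9, 0.3, 0.4]
reranked = graph_rerank(first_pass, similarity_matrix(segments))
```

In this toy case the second segment starts below the third in the first pass (0.3 versus 0.4), but because it is acoustically identical to the top-scoring segment it ends up ranked above the third after re-ranking. Real systems would work with frame-level features such as MFCCs extracted from lattice hypotheses, which this sketch does not model.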