Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
569019 | 876519 | 2006 | 16-page PDF | Free download |
We propose a cross-media lecture-on-demand system, called lodem, which searches a lecture video for specific segments in response to a text query. We exploit the benefits of the text, audio, and video data corresponding to a single lecture. lodem extracts the audio track from a target lecture video, generates a transcription by large-vocabulary continuous speech recognition, and produces a text index. A user can formulate text queries using the textbook related to the target lecture and can selectively view specific video segments by submitting those queries. Experimental results showed that adapting speech recognition to the lecturer and the topic of the target lecture increased recognition accuracy and, consequently, yielded retrieval accuracy comparable to that obtained with human transcription. lodem is implemented as a client–server system on the Web to facilitate e-learning.
Journal: Speech Communication - Volume 48, Issue 5, May 2006, Pages 516–531
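
The indexing and retrieval steps described in the abstract can be illustrated with a minimal sketch. It assumes the speech recognizer has already produced time-stamped transcript segments, and it ranks them with a plain TF-IDF score. The abstract does not state the weighting scheme lodem actually uses, so TF-IDF is a stand-in here, and every name below (`Segment`, `build_index`, `search`) and the sample transcript text are hypothetical.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Segment:
    start: float  # seconds from the start of the lecture video
    end: float
    text: str     # transcript text (assumed to come from speech recognition)


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(segments):
    """Return per-segment term counts and smoothed inverse document frequencies."""
    counts = [Counter(tokenize(s.text)) for s in segments]
    df = Counter(term for c in counts for term in c)
    n = len(segments)
    idf = {term: math.log((n + 1) / (d + 1)) + 1.0 for term, d in df.items()}
    return counts, idf


def search(query, segments, counts, idf, top_k=3):
    """Rank segments by the summed TF-IDF weight of the query terms."""
    terms = tokenize(query)
    scored = []
    for seg, c in zip(segments, counts):
        score = sum(c[t] * idf.get(t, 0.0) for t in terms)
        if score > 0:
            scored.append((score, seg))
    # Sort on the score only, so Segment objects are never compared.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [seg for _, seg in scored[:top_k]]


# Made-up transcript segments, for illustration only.
segments = [
    Segment(0.0, 30.0, "Today we introduce hidden Markov models for speech"),
    Segment(30.0, 60.0, "The Viterbi algorithm finds the best state sequence"),
    Segment(60.0, 90.0, "Language models assign probabilities to word sequences"),
]
counts, idf = build_index(segments)
hits = search("Viterbi algorithm", segments, counts, idf)
for seg in hits:
    print(f"{seg.start:.0f}s to {seg.end:.0f}s: {seg.text}")
```

Each hit carries start and end times, so a client could seek the video player directly to the matching segment. A query taken from a textbook passage would be handled the same way as the short query shown here.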