Free article download: Assisted keyword indexing for lecture videos using unsupervised keyword spotting

Article code	Journal code	Publication year	English article	Full-text version
533940	870192	2016	8-page PDF	Free download

English title of the ISI article

Assisted keyword indexing for lecture videos using unsupervised keyword spotting

Keywords

Mel frequency cepstral coefficients; Unsupervised; Keyword spotting

Related topics

Engineering and Basic Sciences, Computer Engineering, Computer Vision and Pattern Recognition

Abstract

• Created a completely unsupervised within-speaker keyword spotting system to create an accessible index.
• Average Precision at 10 of 71.5% and 79.5% for laptop recorded and in-lecture queries for RIT lectures.
• Whitening is used to reduce variance in MFCC feature vectors (performance increase of 58%).
• In Table 1 and the accompanying text we explicitly define our criteria for defining ‘valid’ search hits.
• MIT lectures recorded on a lapel microphone achieve an average Precision at 10 of 89.5%.
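
The highlights above report results as average Precision at 10. As a minimal sketch of how such a figure can be computed, the Python below assumes each query yields a ranked list of hits, best first, each already judged valid or not. The function names and the choice to divide by k even when fewer than k hits are returned are illustrative assumptions, not the authors' evaluation code.

```python
def precision_at_k(ranked_hits, k=10):
    """Fraction of the top-k hits that are valid.

    ranked_hits: list of booleans, best-scoring hit first;
    True marks a hit judged valid. Assumption: missing hits
    (fewer than k returned) count as misses.
    """
    return sum(ranked_hits[:k]) / k


def average_precision_at_k(per_query_hits, k=10):
    """Mean of precision_at_k over all queries."""
    scores = [precision_at_k(hits, k) for hits in per_query_hits]
    return sum(scores) / len(scores)
```

For example, a query whose top 10 hits contain 7 valid matches scores 0.7, and averaging that score over all keyword queries gives the kind of summary percentage reported above.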

Many students use videos to supplement learning outside the classroom. This is particularly important for students with challenged visual capacities, for whom seeing the board during lecture is difficult. For these students, we believe that recording the lectures they attend and providing effective video indexing and search tools will make it easier for them to learn course subject matter at their own pace. As a first step in this direction, we seek to help instructors create an index for their lecture videos using audio keyword search, with queries recorded by the instructor on their laptop and/or created from video excerpts. For this we have created an unsupervised within-speaker keyword spotting system. We represent audio data using de-noised, whitened and scale-normalized Mel Frequency Cepstral Coefficient (MFCC) features, and locate queries using Segmental Dynamic Time Warping (SDTW) of feature sequences. Our system is evaluated using introductory Linear Algebra lectures from instructors with different accents at two U.S. universities. For lectures produced using a video camera at RIT, laptop-recorded queries obtain an average Precision at 10 of 71.5%, while 79.5% is obtained for within-lecture queries. For lectures recorded using a lapel microphone at MIT, using a similar keyword set we obtain a much higher average Precision at 10 of 89.5%. Our results suggest that our system is robust to changes in environment, speaker and recording setup.
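
The abstract locates queries by warping query feature sequences against lecture audio. The sketch below illustrates the general idea with plain subsequence DTW, a simpler relative of the Segmental DTW the paper names, so it should not be read as the authors' algorithm. It assumes features are already extracted as lists of per-frame vectors (for example, whitened MFCCs); the function names, the Euclidean frame distance and the length normalization are assumptions made for illustration.

```python
import math


def euclidean(a, b):
    """Distance between two feature frames of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def subsequence_dtw(query, utterance):
    """Align `query` to the best-matching contiguous span of `utterance`.

    Returns (cost, start, end): the cumulative frame distance divided by
    the query length, and the inclusive frame range of the match.
    """
    n = len(utterance)
    # First query frame may start anywhere in the utterance at no penalty.
    prev_cost = [euclidean(query[0], utterance[j]) for j in range(n)]
    prev_start = list(range(n))
    for i in range(1, len(query)):
        cur_cost = [0.0] * n
        cur_start = [0] * n
        for j in range(n):
            # Vertical step: query advances, utterance frame repeats.
            best_cost, best_start = prev_cost[j], prev_start[j]
            if j > 0:
                # Diagonal step: both sequences advance.
                if prev_cost[j - 1] <= best_cost:
                    best_cost, best_start = prev_cost[j - 1], prev_start[j - 1]
                # Horizontal step: utterance advances, query frame repeats.
                if cur_cost[j - 1] < best_cost:
                    best_cost, best_start = cur_cost[j - 1], cur_start[j - 1]
            cur_cost[j] = euclidean(query[i], utterance[j]) + best_cost
            cur_start[j] = best_start
        prev_cost, prev_start = cur_cost, cur_start
    end = min(range(n), key=prev_cost.__getitem__)
    return prev_cost[end] / len(query), prev_start[end], end


# Toy 1-D "features": the query pattern 1, 2, 3 occurs at frames 2..4.
utterance = [[0], [0], [1], [2], [3], [0], [0]]
query = [[1], [2], [3]]
print(subsequence_dtw(query, utterance))  # (0.0, 2, 4)
```

Ranking candidate spans by this normalized cost and keeping the ten lowest would produce the kind of hit list that a Precision at 10 evaluation scores.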

Publisher

Database: Elsevier - ScienceDirect
Journal: Pattern Recognition Letters - Volume 71, 1 February 2016, Pages 8–15

Authors

Manish Kanadje, Zachary Miller, Anurag Agarwal, Roger Gaborski, Richard Zanibbi, Stephanie Ludi
