Article code | Journal code | Publication year | English article | Full text |
---|---|---|---|---|
431450 | 688550 | 2015 | 14-page PDF | Free download |
• Comprehensive design exploration for a multithreaded object detection algorithm.
• A Multi-Staged Classifier Grouping scheme to improve data reuse on the local cache.
• A self-adaptable design flow to auto-tune design parameters for a multicore system.
• In-depth performance evaluation with an ARM-based cycle-accurate simulator.
Leveraging multithreading on embedded multicore platforms has proven effective for handling the increasing resolutions of target stimuli in object detection. However, complex tradeoffs and correlated design impacts between a parallel application and the underlying multicore platform necessitate an effective and adaptable multithreaded design. This paper introduces a hybrid multithreaded object detection design with high parallelism and extensive data reuse. A self-adaptable flow is proposed to adjust the multithreaded object detection so that it fully exploits various embedded multicore architectures. ARM-based cycle-accurate simulations of multicore systems show the superior performance of the proposed design.
Journal: Journal of Parallel and Distributed Computing - Volume 78, April 2015, Pages 25–38