کد مقاله کد نشریه سال انتشار مقاله انگلیسی نسخه تمام متن
430254 687949 2013 18 صفحه PDF دانلود رایگان
عنوان انگلیسی مقاله ISI
Multi-route query processing and optimization
موضوعات مرتبط
مهندسی و علوم پایه مهندسی کامپیوتر نظریه محاسباتی و ریاضیات
پیش نمایش صفحه اول مقاله
Multi-route query processing and optimization
چکیده انگلیسی

A modern query optimizer typically picks a single query plan for all data based on overall data statistics. However, many have observed that real-life datasets tend to have non-uniform distributions. Selecting a single query plan may result in ineffective query execution for possibly large portions of the actual data. In addition most stream query processing systems, given the volume of data, cannot precisely model the system state much less account for uncertainty due to continuous variations. Such systems select a single query plan based upon imprecise statistics. In this paper, we present “Query Mesh” (or QM), a practical alternative to state-of-the-art data stream processing approaches. The main idea of QM is to compute multiple routes (i.e., query plans), each designed for a particular subset of the data with distinct statistical properties. We use terms “plans” and “routes” interchangeably in our work. A classifier model is induced and used to assign the best route to process incoming tuples based upon their data characteristics. We formulate the QM search space and analyze its complexity. Due to the substantial search space, we propose several cost-based query optimization heuristics designed to effectively find nearly optimal QMs. We propose the Self-Routing Fabric (SRF) infrastructure that supports query execution with multiple plans without physically constructing their topologies nor using a central router like Eddy. We also consider how to support uncertain route specification and execution in QM which can occur when imprecise statistics lead to more than one optimal route for a subset of data. Our experimental results indicate that QM consistently provides better query execution performance and incurs negligible overhead compared to the alternative state-of-the-art data stream approaches.


► Query mesh is a solution between plan-based systems and continuously reoptimizing solutions.
► Our classifier operator employs machine learning and inspects incoming tuples to determine the best routes for each tuple.
► Our Self-Routing Fabric infrastructure execute multiple routes in parallel.

ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Journal of Computer and System Sciences - Volume 79, Issue 3, May 2013, Pages 312–329
نویسندگان
, , , , ,