کد مقاله کد نشریه سال انتشار مقاله انگلیسی نسخه تمام متن
395506 665988 2007 23 صفحه PDF دانلود رایگان
عنوان انگلیسی مقاله ISI
Compressed histograms with arbitrary bucket layouts for selectivity estimation
موضوعات مرتبط
مهندسی و علوم پایه مهندسی کامپیوتر هوش مصنوعی
پیش نمایش صفحه اول مقاله
Compressed histograms with arbitrary bucket layouts for selectivity estimation
چکیده انگلیسی

Selectivity estimation is an important step of query optimization in a database management system, and multi-dimensional histogram techniques have proved promising for selectivity estimation. Recent multi-dimensional histogram techniques such as GenHist and STHoles use an arbitrary bucket layout. This layout has the advantage of requiring a smaller number of buckets to model tuple densities than those required by the traditional grid or recursive layouts. However, the arbitrary bucket layout brings an inherent disadvantage of requiring more memory to store each bucket location information. This diminishes the advantage of requiring fewer buckets and, therefore, has an adverse effect on the resulting selectivity estimation accuracy. To our knowledge, however, no existing histogram-based technique with arbitrary layout addresses this issue. In this paper, we introduce the idea of bucket location compression and then demonstrate its effectiveness for improving selectivity estimation accuracy by proposing the STHoles+ technique. STHoles+ extends STHoles by quantizing each coordinate of a bucket relative to the coordinate of the smallest enclosing bucket. This quantization increases the number of histogram buckets that can be stored in the histogram. Our quantization scheme allows STHoles+ to trade precision of histogram bucket locations for storing more buckets. Experimental results show that STHoles+ outperforms STHoles on various data distributions, query distributions, and other factors such as available memory size, quantization resolution, and dimensionality of the data space.

ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Information Sciences - Volume 177, Issue 3, 1 February 2007, Pages 680–702
نویسندگان
, , ,