Article code: 396892
Journal code: 670620
Publication year: 2013
English article: 16 pages, PDF
English title of the ISI article
UpSizeR: Synthetically scaling an empirical relational database
Related topics
Engineering and Basic Sciences > Computer Engineering > Artificial Intelligence
Abstract


• Data generation for benchmarks: replacing domain-specific TPC with application-specific UpSizeR.
• Given an empirical database D and scale factor s, generate a synthetic D′ similar to D but s times its size.
• Experiments with Flickr show results that match crawled data and predict throughput degradation.
• Issue raised: How do social interactions affect attribute correlation in a database?
• Research program proposal: application-specific benchmarking of data-centric systems.

The TPC benchmarks have helped users evaluate database system performance at different scales. Although each benchmark is domain-specific, it is not equally relevant to different applications in the same domain. The present proliferation of applications also leaves many of them uncovered by the very limited number of current TPC benchmarks.

There is therefore a need to develop tools for application-specific database benchmarking. This paper presents UpSizeR, a software tool that addresses the Dataset Scaling Problem: given an empirical set of relational tables D and a scale factor s, generate a database state D̃ that is similar to D but s times its size. Such a tool can be useful for scaling up D for scalability testing (s > 1), scaling down for application testing (s < 1), or anonymization (s = 1).

Experiments with Flickr show that query results and response times on UpSizeR output match those on crawled data. They also accurately predict throughput degradation for a scale-out test.

The UpSizeR version in this paper focuses on extracting and replicating the correlation induced by the primary and foreign keys. There are many other forms of correlation involving non-key values. Developing UpSizeR into a tool that can extract and replicate all important correlations is a large task, so community effort is required. The current UpSizeR code has therefore been released for open-source development. The ultimate objective is to replace TPC with UpSizeR, so database owners can generate benchmarks that are relevant to their applications.
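
As a rough illustration of the Dataset Scaling Problem, the Python sketch below scales a toy two-table database (a parent table and a child table that references it through a foreign key) by a factor s. It is not the UpSizeR algorithm, which the abstract does not describe in detail. It is a naive stand-in that preserves only one foreign-key property: the empirical distribution of how many child rows reference each parent row. The function name `scale_parent_child` and the list-based table layout are assumptions made for this example.

```python
import random
from collections import Counter


def scale_parent_child(parent_ids, child_fks, s, seed=0):
    """Scale a toy parent/child database by factor s (illustrative only).

    parent_ids: primary keys of the parent table.
    child_fks:  one foreign key per child row, each pointing at a parent.
    Returns (new_parent_ids, new_child_fks).
    """
    if not parent_ids:
        return [], []

    rng = random.Random(seed)

    # Empirical degree of each parent: how many child rows reference it.
    counts = Counter(child_fks)
    degrees = [counts.get(pid, 0) for pid in parent_ids]

    # Scale the parent table to roughly s times its original size.
    n_new = max(1, round(s * len(parent_ids)))
    new_parent_ids = list(range(n_new))

    # Give every synthetic parent a degree resampled from the empirical
    # degrees, then emit that many child rows pointing at it.
    new_child_fks = []
    for pid in new_parent_ids:
        new_child_fks.extend([pid] * rng.choice(degrees))

    return new_parent_ids, new_child_fks


if __name__ == "__main__":
    users = [1, 2, 3, 4]
    photo_owner = [1, 1, 2, 4, 4, 4]  # user 3 owns no photos
    new_users, new_photo_owner = scale_parent_child(users, photo_owner, s=2)
    print(len(new_users), len(new_photo_owner))
```

Because the degrees are resampled independently, the child table is s times its original size only in expectation, and any correlation with non-key attributes is ignored. That is the kind of gap the abstract says remains open.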

Publisher
Database: Elsevier - ScienceDirect
Journal: Information Systems - Volume 38, Issue 8, November 2013, Pages 1168–1183