Article code, Journal code, Publication year, English article, Full-text version
425555, 685780, 2016, 12-page PDF, Free download
English title of the ISI article
Auditable versioned data storage outsourcing
Related topics
Engineering and Basic Sciences, Computer Engineering, Computational Theory and Mathematics
Abstract


• Algorithms realizing a persistent data structure to support versioning are proposed.
• The function of auditing the latest version is extended to audit all versions of data.
• Compared with delta-based versioning, the expected worst case is significantly better.
• Unlike tree-based approaches, this is achieved without any re-balancing operation.
• Our approach realizes block-level deduplication across versions of data.

Auditability is crucial for data outsourcing: it facilitates accountability and helps identify data loss or corruption incidents in a timely manner, which in turn reduces the risks from such losses. In recent years, in step with the growing trend of outsourcing, considerable progress has been made in designing probabilistic (for efficiency) provable data possession (PDP) schemes. However, even recent and advanced PDP solutions that handle dynamic data do so in a limited manner, and only for the latest version of the data. A naive solution that treats different versions in isolation would work, but it incurs substantial overhead and is undesirable. In this paper, we present algorithms to achieve full persistence (all intermediate configurations are preserved and are modifiable) for an optimized skip list (known as FlexList) so that versioned data can be audited. The proposed scheme provides deduplication at the level of logical, variable-sized blocks, such that only the altered parts of the different versions are kept. The persistent data structure facilitates access (read) of any arbitrary version with the same storage and processing efficiency that state-of-the-art dynamic PDP solutions provide for only the current version, while commit (write) operations incur around 5% additional time. Furthermore, the time overhead for auditing arbitrary versions in addition to the latest version is imperceptible even on a low-end server. Additionally, applying our approach makes it possible to support block-level deduplication naturally. While a naive solution to audit versions would copy the whole data and the data structure for each version, our solution uses storage space very close to that of the most efficient delta-based solutions.
Accordingly, we explore how the proposed data structure benefits the system with block-level deduplication in addition to auditability, and how it can be integrated with a state-of-the-art versioning system (Git), improving Git's storage efficiency in the process and thus helping to scale the size of data that can be stored in Git, without compromising the retrieval efficiency of arbitrary versions.
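
The idea of keeping only the altered parts of each version can be illustrated with a minimal sketch. This is not the paper's persistent FlexList: it assumes fixed-size blocks (the abstract describes variable-sized ones), uses SHA-256 content addressing as a stand-in for the real indexing structure, and omits all auditing tags. The names `DedupStore`, `commit`, `read`, and `BLOCK_SIZE` are hypothetical.

```python
import hashlib

BLOCK_SIZE = 4  # assumption: tiny fixed-size blocks, for illustration only


class DedupStore:
    """Hypothetical content-addressed block store shared by all versions."""

    def __init__(self):
        self.blocks = {}    # digest -> block bytes, each distinct block stored once
        self.versions = []  # version id -> ordered list of block digests

    def commit(self, data: bytes) -> int:
        """Store a new version; only blocks not seen before take new space."""
        digests = []
        for i in range(0, len(data), BLOCK_SIZE):
            block = data[i:i + BLOCK_SIZE]
            digest = hashlib.sha256(block).hexdigest()
            self.blocks.setdefault(digest, block)  # no-op if already stored
            digests.append(digest)
        self.versions.append(digests)
        return len(self.versions) - 1

    def read(self, version: int) -> bytes:
        """Reassemble any version from the shared block pool."""
        return b"".join(self.blocks[d] for d in self.versions[version])


store = DedupStore()
v0 = store.commit(b"aaaabbbbcccc")
v1 = store.commit(b"aaaaXXXXcccc")  # only the middle block changed
```

Here the two versions reference six blocks in total, but only four distinct blocks are stored, and either version can be read back directly without replaying a chain of deltas.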

Publisher
Database: Elsevier - ScienceDirect
Journal: Future Generation Computer Systems - Volume 55, February 2016, Pages 17–28