Article code: 434102
Journal code: 1441659
Publication year: 2015
English article: 21-page PDF
Full text: free download
English title of the ISI article
Automatic generation of valid and invalid test data for string validation routines using web searches and regular expressions
Keywords
Test data generation, web searches, regular expressions
Related topics
Engineering and Basic Sciences, Computer Engineering, Computational Theory and Mathematics
English abstract


• An approach for finding valid values for string data types on the Internet.
• A mutation algorithm for regular expressions to produce invalid values for string data types.
• A testing procedure to identify program errors using the valid and invalid values.
• An empirical study of the approach on 24 open source case studies.
• An analysis of the approach against two contemporary test data generation tools.
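The testing procedure named in the highlights can be illustrated with a minimal sketch: run the validation routine on values believed to be valid and on values believed to be invalid, then flag every disagreement for inspection. The names `find_suspect_inputs` and `naive_is_email`, and the sample strings, are assumptions made for this illustration and are not taken from the paper.

```python
from typing import Callable, Iterable, List, Tuple


def find_suspect_inputs(
    validate: Callable[[str], bool],
    valid_values: Iterable[str],
    invalid_values: Iterable[str],
) -> Tuple[List[str], List[str]]:
    """Return (valid values that were rejected, invalid values that were accepted)."""
    wrongly_rejected = [v for v in valid_values if not validate(v)]
    wrongly_accepted = [v for v in invalid_values if validate(v)]
    return wrongly_rejected, wrongly_accepted


# Hypothetical routine under test: a deliberately naive email check.
def naive_is_email(s: str) -> bool:
    return "@" in s


rejected, accepted = find_suspect_inputs(
    naive_is_email,
    valid_values=["alice@example.com", "bob.smith@example.org"],
    invalid_values=["alice@", "@example.com", "no-at-sign"],
)
print("valid but rejected:", rejected)
print("invalid but accepted:", accepted)
```

Each flagged value is only a candidate fault: a tester still has to decide whether the routine or the expected classification is wrong.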

Classic approaches to automatic input data generation are usually driven by the goal of obtaining program coverage and the need to solve or find solutions to path constraints to achieve this. As inputs are generated with respect to the structure of the code, they can be ineffective, difficult for humans to read, and unsuitable for testing missing implementation. Furthermore, these approaches have known limitations when handling constraints that involve operations with string data types.

This paper presents a novel approach for generating string test data for string validation routines, by harnessing the Internet. The technique uses program identifiers to construct web search queries for regular expressions that validate the format of a string type (such as an email address). It then performs further web searches for strings that match the regular expressions, producing examples of test cases that are both valid and realistic. Following this, our technique mutates the regular expressions to drive the search for invalid strings, and the production of test inputs that should be rejected by the validation routine.

The paper presents the results of an empirical study evaluating our approach. The study was conducted on 24 string input validation routines collected from 10 open source projects. While dynamic symbolic execution and search-based testing approaches were only able to generate a very low number of values successfully, our approach generated values with an accuracy of 34% on average for the case of valid strings, and 99% on average for the case of invalid strings. Furthermore, whereas dynamic symbolic execution and search-based testing approaches were only capable of detecting faults in 8 routines, our approach detected faults in 17 out of the 19 validation routines known to contain implementation errors.
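The pipeline described in the abstract can be sketched roughly as follows. This is not the authors' implementation: the abstract does not list the mutation operators used, so the single operator here (relaxing an unescaped `+` to `*`) is an assumption, as are the function names, the sample pattern, and the hard-coded `CANDIDATES` list that stands in for web search results.

```python
import re
from typing import Iterable, List


def identifier_to_query(identifier: str) -> str:
    """Split a camelCase or snake_case identifier into words and build a search query."""
    words = re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])|\d+", identifier)
    return " ".join(word.lower() for word in words) + " regular expression"


def mutate_regex(pattern: str) -> List[str]:
    """Assumed mutation operator: relax each '+' not preceded by a backslash to '*'.

    This is a textual heuristic; it does not understand character classes.
    """
    mutants = []
    for i, ch in enumerate(pattern):
        if ch == "+" and (i == 0 or pattern[i - 1] != "\\"):
            mutant = pattern[:i] + "*" + pattern[i + 1:]
            try:
                re.compile(mutant)  # discard mutants that are not valid regexes
            except re.error:
                continue
            mutants.append(mutant)
    return mutants


def select_valid(pattern: str, candidates: Iterable[str]) -> List[str]:
    """Keep candidates that fully match the original pattern."""
    return [c for c in candidates if re.fullmatch(pattern, c)]


def select_invalid(pattern: str, candidates: Iterable[str]) -> List[str]:
    """Keep candidates that match some mutant but not the original pattern."""
    mutants = mutate_regex(pattern)
    return [
        c for c in candidates
        if not re.fullmatch(pattern, c)
        and any(re.fullmatch(m, c) for m in mutants)
    ]


# Hypothetical inputs: a simplified email pattern and stand-in search results.
EMAIL_PATTERN = r"[a-z0-9.]+@[a-z0-9-]+\.[a-z]+"
CANDIDATES = ["alice@example.com", "@example.com", "alice@example.", "hello world"]

query = identifier_to_query("isValidEmailAddress")
valid = select_valid(EMAIL_PATTERN, CANDIDATES)
invalid = select_invalid(EMAIL_PATTERN, CANDIDATES)
print(query)
print(valid)
print(invalid)
```

Requiring an invalid string to match a mutant, rather than merely fail the original pattern, keeps it close to the valid format, which is what makes it a useful test input for a validation routine.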

Publisher
Database: Elsevier - ScienceDirect
Journal: Science of Computer Programming - Volume 97, Part 4, 1 January 2015, Pages 405–425