کد مقاله کد نشریه سال انتشار مقاله انگلیسی نسخه تمام متن
448160 693538 2014 23 صفحه PDF دانلود رایگان
عنوان انگلیسی مقاله ISI
Practical regular expression matching free of scalability and performance barriers
ترجمه فارسی عنوان
عبارات منظم عملی، بدون مقیاس پذیری و مانع عملکرد
کلمات کلیدی
موضوعات مرتبط
مهندسی و علوم پایه مهندسی کامپیوتر شبکه های کامپیوتری و ارتباطات
چکیده انگلیسی

Regular expressions (regexes) provide rich expressiveness to specify the signatures of intrusions and are widely used in contemporary network security systems for signature-based intrusion detection. To perform very fast regex matching, deterministic finite automata (DFA) has been the first choice because its time complexity is constant O(1)O(1). Unfortunately, DFA often suffers the well known state explosion problem and, consequently, tends to require prohibitive memory overhead in practical applications. To address the problem, a wide variety of DFA compression techniques have been proposed; however, few can keep up with the ever increasing network traffic bandwidth and regex set complexity. This paper proposes that the DFA problem is rooted in regexes (rather than in DFA), i.e., semantic overlapping of regexes, and accordingly presents a complete algorithmic solution PaCC (Partition, Compression, and Combination), that can transform the given large-scale set of complex regexes into a compact and fast matching engine using DFA as its core. PaCC fundamentally defuses state explosion for DFA by partitioning complex regexes into overlapping-free segments. By exploiting the massive repetitiveness among the resulting segments, PaCC can further deflate corresponding DFA in terms of the number of states. Moreover, on the basis of the characteristics of these segments, PaCC takes a tailor-made compression approach and reduces over 96% of the state transitions for the corresponding DFA. In the final matching engine, the combination of DFA and a small relation mapping table, built from segments and their syntagmatic relations, respectively, guarantees high performance and semantic equivalence. Experimental evaluation shows that PaCC produces succinct matching engines with memory usage proportional to the size of the real-world Snort and Bro regex sets, with speeds of up to 1.7 Gbps per core on a HP Z220 SFF workstation with a 3.40 GHz Intel Core i7-3770.

ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Computer Communications - Volume 54, 1 December 2014, Pages 97–119
نویسندگان
, , , ,