Discovering subword associations in strings in time linear in the output size

کد مقاله	کد نشریه	سال انتشار	مقاله انگلیسی	نسخه تمام متن
431107	688275	2009	12 صفحه PDF	دانلود رایگان

عنوان انگلیسی مقاله ISI

دانلود مقاله + سفارش ترجمه

دانلود مقاله ISI انگلیسی

رایگان برای ایرانیان

کلمات کلیدی

Pattern Matching - تطبیق الگو String searching - رشته جستجو Association rule - قانون انجمن

موضوعات مرتبط

مهندسی و علوم پایه مهندسی کامپیوتر نظریه محاسباتی و ریاضیات

پیش نمایش صفحه اول مقاله

Discovering subword associations in strings in time linear in the output size

چکیده انگلیسی

Given a text string x of n symbols and an integer constant d , we consider the problem of finding, for any pair (y,z)(y,z) of subwords of x, the tandem index associated with the pair, which is defined as the number of times that y and z occur in tandem (i.e., with no intermediate occurrence of either one of them) within a distance of d symbols of x . Although in principle there might be O(n4)O(n4) distinct subword pairs in x , it is seen that it suffices to consider a family of only O(n2)O(n2) such pairs, with the property that for any neglected pair (y′,z′)(y′,z′) there exists a corresponding pair (y,z)(y,z) contained in our family such that: (i) y′y′ is a prefix of y and z′z′ is a prefix of z ; and (ii) the tandem index of (y′,z′)(y′,z′) equals that of (y,z)(y,z). The main contribution of the paper consists of an algorithm showing that the computation of all non-zero tandem indices for a string can be carried out optimally in time and space linear in the size of the output.

ناشر

Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Journal of Discrete Algorithms - Volume 7, Issue 2, June 2009, Pages 227–238

نویسندگان

Alberto Apostolico, Giorgio Satta,

علوم انسانی و هنر

فنی، مهندسی و علوم پایه

پزشکی و سلامت

بیو تکنولوژی

پذیرش سفارش ترجمه

Discovering subword associations in strings in time linear in the output size

دسترسی سریع

ارتباط

English Website