Regular expression constrained sequence alignment

کد مقاله	کد نشریه	سال انتشار	مقاله انگلیسی	نسخه تمام متن
430937	688234	2007	15 صفحه PDF	دانلود رایگان

عنوان انگلیسی مقاله ISI

دانلود مقاله + سفارش ترجمه

دانلود مقاله ISI انگلیسی

رایگان برای ایرانیان

کلمات کلیدی

Finite automaton - اتوماتای محدود Dynamic programming - برنامه‌ریزی پویا یا برنامه‌ نویسی پویا Sequence alignment - ترتیب توالی Pattern Matching - تطبیق الگو Regular expression - عبارت منظم

موضوعات مرتبط

مهندسی و علوم پایه مهندسی کامپیوتر نظریه محاسباتی و ریاضیات

پیش نمایش صفحه اول مقاله

چکیده انگلیسی

We introduce regular expression constrained sequence alignment as the problem of finding the maximum alignment score between given strings S1S1 and S2S2 over all alignments such that in these alignments there exists a segment where some substring s1s1 of S1S1 is aligned to some substring s2s2 of S2S2, and both s1s1 and s2s2 match a given regular expression R , i.e. s1,s2∈L(R)s1,s2∈L(R) where L(R)L(R) is the regular language described by R . For complexity results we assume, without loss of generality, that n=|S1|⩾|m|=|S2|n=|S1|⩾|m|=|S2|. A motivation for the problem is that protein sequences can be aligned in a way that known motifs guide the alignments. We present an O(nmr)O(nmr) time algorithm for the regular expression constrained sequence alignment problem where r=O(t4)r=O(t4), and t is the number of states of a nondeterministic finite automaton N that accepts L(R)L(R). We use in our algorithm a nondeterministic weighted finite automaton M that we construct from N. M has O(t2)O(t2) states where the transition-weights are obtained from the given costs of edit operations, and state-weights correspond to optimum alignment scores we compute using the underlying dynamic programming solution for sequence alignment. If we are given a deterministic finite automaton D accepting L(R)L(R) with tdtd states then our construction creates a deterministic finite automaton MdMd with td2 states. In this case, our algorithm takes O(td2nm) time. Using MdMd results in faster computation than using M when td

ناشر

Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Journal of Discrete Algorithms - Volume 5, Issue 4, December 2007, Pages 647–661

نویسندگان

Abdullah N. Arslan,

علوم انسانی و هنر

فنی، مهندسی و علوم پایه

پزشکی و سلامت

بیو تکنولوژی

پذیرش سفارش ترجمه

Regular expression constrained sequence alignment

دسترسی سریع

ارتباط

English Website