Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
435394 | 689902 | 2009 | 12-page PDF | Free download |

Words that appear as constrained subsequences in a text-string are considered as possible indicators of the host string structure, hence also as a possible means of sequence comparison and classification. The constraint consists of imposing a bound on the number ω of positions in the text that may intervene between any two consecutive characters of a subsequence. A subset of such ω-sequences is then characterized that consists, in intuitive terms, of sequences that could not be enriched with more characters without losing some occurrence in the text. A compact spatial representation is then proposed for these representative sequences, within which a number of parameters can be defined and measured. In the final part of the paper, such parameters are empirically analyzed on a small collection of text-strings endowed with various degrees of structure.
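The gap constraint described above can be illustrated with a minimal sketch. It assumes 0-based positions and reads "at most ω intervening positions" as: two consecutive matched characters sit at text positions i < j with j - i - 1 ≤ ω. The function name `omega_occurrences` and the choice to report end positions are illustrative assumptions, not the paper's notation or algorithm.

```python
def omega_occurrences(text: str, word: str, omega: int) -> list[int]:
    """Return the sorted end positions (0-based) at which `word` occurs in
    `text` as a subsequence with at most `omega` text positions between
    any two consecutive matched characters."""
    if not word:
        return []
    # Positions where an occurrence of the first character can end.
    ends = [i for i, c in enumerate(text) if c == word[0]]
    for ch in word[1:]:
        reachable = set()
        for i in ends:
            # The next character may sit at i+1 .. i+omega+1 (inclusive).
            for j in range(i + 1, min(i + omega + 2, len(text))):
                if text[j] == ch:
                    reachable.add(j)
        ends = sorted(reachable)
        if not ends:
            break  # no partial occurrence survives the gap bound
    return ends


if __name__ == "__main__":
    print(omega_occurrences("abracadabra", "aba", 1))  # [3, 10]
    print(omega_occurrences("abracadabra", "aba", 0))  # []
```

With ω = 0 the constraint reduces to ordinary substring matching, so "aba" has no occurrence in "abracadabra"; relaxing the bound to ω = 1 admits the two occurrences that skip over "r". The sketch tracks every reachable end position rather than matching greedily, because a greedy leftmost match can miss occurrences once gaps are bounded.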
Journal: Theoretical Computer Science - Volume 410, Issue 43, 6 October 2009, Pages 4360-4371