Article ID Journal Published Year Pages File Type
419205 Discrete Applied Mathematics 2016 13 Pages PDF
Abstract

A repetition-free Longest Common Subsequence (LCS) of two sequences xx and yy is an LCS of xx and  yy where each symbol may appear at most once. Let RR denote the length of a repetition-free LCS of two sequences of nn symbols each one chosen randomly, uniformly, and independently over a kk-ary alphabet. We study the asymptotic, in nn and kk, behavior of RR and establish that there are three distinct regimes, depending on the relative speed of growth of nn and kk. For each regime we establish the limiting behavior of RR. In fact, we do more, since we actually establish tail bounds for large deviations of RR from its limiting behavior.Our study is motivated by the so called exemplar model proposed by Sankoff (1999) and the related similarity measure introduced by Adi et al. (2010). A natural question that arises in this context, which as we show is related to long standing open problems in the area of probabilistic combinatorics, is to understand the asymptotic, in nn and kk, behavior of parameter RR.

Keywords
Related Topics
Physical Sciences and Engineering Computer Science Computational Theory and Mathematics
Authors
, ,