Article ID: 418587
Journal: Discrete Applied Mathematics
Published Year: 2015
Pages: 22
File Type: PDF
Abstract

Tandem duplication is a rearrangement process whereby a segment of DNA is replicated and proximally inserted. A sequence of these events is termed an evolution. Many different configurations can arise from such evolutions, generating some interesting combinatorial properties. Firstly, new DNA connections arising in an evolution can be algebraically represented with a word producing automaton. The number of words arising from $n$ tandem duplications can then be recursively derived. Secondly, many distinct evolutions result in the same sequence of words. With the aid of a bi-colored 2d-tree, a Hasse diagram corresponding to a partially ordered set is constructed, for which the number of linear extensions equates to the number of evolutions generating a given word sequence. Thirdly, we implement some subtree prune and graft operations on this structure to show that the total number of possible evolutions arising from $n$ tandem duplications is $\prod_{k=1}^{n} \left(4^k - (2k+1)\right)$. The space of structures arising from tandem duplication thus grows at a super-exponential rate with leading order term $O(4^{\frac{1}{2}n^2})$.
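
As a rough numerical illustration of the closing claim, the product can be evaluated exactly for small n and its base-4 logarithm compared with n^2/2. This is only a sketch of the arithmetic stated in the abstract, not the authors' derivation; the function names `evolution_count` and `log4_ratio` are hypothetical and chosen here for illustration.

```python
from math import log


def evolution_count(n: int) -> int:
    """Evaluate prod_{k=1}^{n} (4^k - (2k + 1)) exactly using Python integers.

    The formula is taken from the abstract; the empty product (n = 0) is 1.
    """
    total = 1
    for k in range(1, n + 1):
        total *= 4**k - (2 * k + 1)
    return total


def log4_ratio(n: int) -> float:
    """Ratio of log_4(evolution_count(n)) to n^2 / 2 (hypothetical helper).

    A value approaching 1 as n grows is consistent with the exponent
    being n^2 / 2 to leading order.
    """
    return log(evolution_count(n), 4) / (0.5 * n * n)


if __name__ == "__main__":
    for n in range(1, 5):
        print(n, evolution_count(n))  # 1, 11, 627, 154869
    print(log4_ratio(200))
```

The factors for k = 1, 2, 3, 4 are 1, 11, 57 and 247, so the first few counts are 1, 11, 627 and 154869. Because each factor is dominated by 4^k, the product behaves like 4^{n(n+1)/2}, whose exponent is n^2/2 to leading order.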

Related Topics
Physical Sciences and Engineering > Computer Science > Computational Theory and Mathematics