Article ID | Journal | Published Year | Pages |
---|---|---|---|
10231882 | Computational Biology and Chemistry | 2014 | 24 |
Abstract
Genomic sequences exhibit self-organization properties at various hierarchical levels. One such level is the gene structure of higher eukaryotes, with its complex exon/intron arrangement. Exon sizes and exon numbers in genes have been shown to conform to a law derived from statistical linguistics and formulated by Menzerath and Altmann, according to which the mean size of the constituents of an entity is inversely related to the number of these constituents. Here we perform a detailed analysis of this property in the complete exon set of the mouse genome, in relation to the sequence conservation of each exon and the transcriptional complexity of each gene locus. We show that extensive linear fits, which indicate accordance with the Menzerath-Altmann law, are restricted to a particular subset of genes: those formed by exons under low or intermediate sequence constraints and having a small number of alternative transcripts. Based on this observation, we hypothesize that the Menzerath-Altmann law in mammalian genes is predominantly due to genes that are more versatile in function and thus more prone to changes in their structure. In support of this, we present one test case in which gene categories of different functionality also differ in the extent of their conformity to the Menzerath-Altmann law.
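As an illustration of the relationship described in the abstract, the following is a minimal sketch of how mean exon size could be related to exon number. It is not the authors' pipeline. It assumes the simplified power-law form of the law, y = a·x^b with b < 0, where x is the number of exons in a gene and y is the mean exon size; under that assumption a "linear fit" is a straight line in log-log space. The function names and the toy exon lengths are hypothetical and are not taken from the paper.

```python
import math
from collections import defaultdict


def mean_exon_size_by_count(genes):
    """Map exon count -> mean exon size over all genes with that count.

    `genes` is a list of genes, each given as a list of exon lengths (bp).
    This input layout is an assumption made for the sketch.
    """
    per_count = defaultdict(list)
    for exon_lengths in genes:
        if exon_lengths:  # skip genes with no annotated exons
            per_count[len(exon_lengths)].append(
                sum(exon_lengths) / len(exon_lengths)
            )
    return {n: sum(v) / len(v) for n, v in sorted(per_count.items())}


def fit_power_law(mean_sizes):
    """Least-squares fit of log(y) = log(a) + b * log(x); returns (a, b)."""
    xs = [math.log(n) for n in mean_sizes]
    ys = [math.log(s) for s in mean_sizes.values()]
    if len(xs) < 2:
        raise ValueError("need at least two distinct exon counts to fit")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = math.exp(my - b * mx)
    return a, b


if __name__ == "__main__":
    # Hypothetical toy genes: more exons, smaller exons on average.
    toy_genes = [
        [1000],
        [700, 720],
        [560, 600, 580],
        [500, 480, 520, 500],
    ]
    a, b = fit_power_law(mean_exon_size_by_count(toy_genes))
    print(f"a = {a:.1f}, b = {b:.3f}")  # a negative b matches the inverse relation
```

A negative fitted exponent b corresponds to the inverse relation between exon number and mean exon size. Comparing how well such a fit holds across subsets of genes (for example, grouped by exon conservation or by number of alternative transcripts) would mirror the kind of comparison the abstract describes, though the grouping criteria and the exact functional form used in the paper are not specified here.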
Related Topics
Physical Sciences and Engineering
Chemical Engineering
Bioengineering
Authors
Christoforos Nikolaou