Article code: 2825576
Journal code: 1161963
Publication year: 2007
English article: 4-page PDF
Full-text version: free download
English title of the ISI article
Reconsidering the significance of genomic word frequencies
Related topics
Life Sciences and Biotechnology; Biochemistry, Genetics and Molecular Biology; Genetics
English abstract

By conventional wisdom, a feature that occurs too often or too rarely in a genome can indicate a functional element. To infer functionality from frequency, it is crucial to precisely characterize occurrences in randomly evolving DNA. We find that the frequency of oligonucleotides in a genomic sequence follows primarily a Pareto-lognormal distribution, which encapsulates lognormal and power-law features found across all known genomes. Such a distribution could be the result of completely random evolution by a copying process. Our characterization of the entire frequency distribution of genomic words opens a way to a more accurate reasoning about their over- and underrepresentation in genomic sequences.
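
As a rough illustration of what "genomic word frequencies" means in practice, the sketch below counts overlapping words of length k in a DNA string and then tabulates how many distinct words occur once, twice, and so on. It is a minimal sketch, not the authors' method: the function names `count_words` and `frequency_spectrum`, the choice to skip words containing letters other than A, C, G, T, and the toy input are all assumptions made for this example. Fitting a Pareto-lognormal distribution to the resulting spectrum is not shown.

```python
from collections import Counter

DNA_LETTERS = set("ACGT")


def count_words(sequence, k):
    """Count overlapping words of length k in a DNA sequence.

    Words containing any letter other than A, C, G, T (for example N)
    are skipped; this is an assumption of the sketch.
    """
    seq = sequence.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if set(word) <= DNA_LETTERS:
            counts[word] += 1
    return counts


def frequency_spectrum(counts):
    """Map an occurrence count to the number of distinct words with that count."""
    return Counter(counts.values())


if __name__ == "__main__":
    toy = "ACGTACGT"  # hypothetical toy input, not real genomic data
    words = count_words(toy, 4)
    print(dict(words))
    print(dict(frequency_spectrum(words)))
```

The spectrum returned by `frequency_spectrum` is the empirical distribution whose shape the abstract describes; a word would be judged over- or underrepresented relative to that whole distribution rather than to a single expected count.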

Publisher
Database: Elsevier - ScienceDirect
Journal: - Volume 23, Issue 11, November 2007, Pages 543–546
Authors
, , ,