Article ID: 2819252
Journal: Gene
Published Year: 2008
Pages: 4
File Type: PDF
Abstract

The ENCODE (ENCyclopedia Of DNA Elements) project was launched three years ago to identify all of the functional elements in the human genome. It began with 44 target sequences, which together comprise 1% of the human genome. A crucial question is how representative ENCODE is of the genome as a whole. This is not a negligible problem: only 1% of the genome was selected for the project and, more importantly, the large DNA segments were chosen according to two major criteria, namely the presence of extensively characterized genes and/or other functional elements, and the availability of a substantial amount of comparative sequence data. We found that the ENCODE data give an unbalanced representation of the compositional pattern of the human genome, especially for the GC-poorest and GC-richest regions. This imbalance can, however, be corrected by multiplying ENCODE data by a G/E factor (the ratio of whole-genome data to ENCODE data), thereby increasing the potential interest of ENCODE.
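
The G/E correction described above can be illustrated with a minimal sketch. It assumes the factor is computed separately for each compositional (GC) class, which the abstract suggests but does not state outright. The class labels, the percentages, and the function names `ge_factors` and `apply_correction` are hypothetical and chosen only for illustration; they are not values or methods from the paper.

```python
def ge_factors(genome_share, encode_share):
    """Return the G/E factor for each compositional class.

    G/E = share of the class in the whole genome / share in ENCODE.
    Both arguments map a class label to a percentage.
    """
    factors = {}
    for gc_class, g in genome_share.items():
        e = encode_share[gc_class]
        if e <= 0:
            # A class absent from ENCODE cannot be rescaled.
            raise ValueError(f"no ENCODE data for class {gc_class!r}")
        factors[gc_class] = g / e
    return factors


def apply_correction(encode_values, factors):
    """Multiply each ENCODE value by the G/E factor of its class."""
    return {k: v * factors[k] for k, v in encode_values.items()}


# Hypothetical percentages of sequence in each GC class (not from the paper).
genome_share = {"GC-poor": 40.0, "GC-mid": 40.0, "GC-rich": 20.0}
encode_share = {"GC-poor": 20.0, "GC-mid": 40.0, "GC-rich": 40.0}

# Hypothetical per-class counts of some feature observed in ENCODE regions.
encode_counts = {"GC-poor": 100.0, "GC-mid": 200.0, "GC-rich": 300.0}

factors = ge_factors(genome_share, encode_share)
corrected = apply_correction(encode_counts, factors)

for gc_class in encode_counts:
    print(gc_class, factors[gc_class], corrected[gc_class])
```

In this toy case the GC-poor class is under-represented in ENCODE, so its values are scaled up, while the over-represented GC-rich class is scaled down. A class with no ENCODE coverage cannot be corrected this way, which is why the sketch raises an error instead of dividing by zero.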
