کد مقاله | کد نشریه | سال انتشار | مقاله انگلیسی | نسخه تمام متن |
---|---|---|---|---|
10231879 | 1373 | 2014 | 10 صفحه PDF | دانلود رایگان |
عنوان انگلیسی مقاله ISI
Characterizing regions in the human genome unmappable by next-generation-sequencing at the read length of 1000 bases
ترجمه فارسی عنوان
مشخص کردن مناطق در ژنوم انسان قابل تعویض توسط توالی نسل بعدی در طول خواندن 1000 پایه
دانلود مقاله + سفارش ترجمه
دانلود مقاله ISI انگلیسی
رایگان برای ایرانیان
موضوعات مرتبط
مهندسی و علوم پایه
مهندسی شیمی
بیو مهندسی (مهندسی زیستی)
چکیده انگلیسی
Repetitive and redundant regions of a genome are particularly problematic for mapping sequencing reads. In the present paper, we compile a list of the unmappable regions in the human genome based on the following definition: hypothetical reads with length 1Â kb which cannot be uniquely mapped with zero-mismatch alignment for the described regions, considering both the forward and reverse strand. The respective collection of unmappable regions covers 0.77% of the sequence of human autosomes and 8.25% of the sex chromosomes in the reference genome GRCh37/hg19 (overall 1.23%). Not surprisingly, our unmappable regions overlap greatly with segmental duplication, transposable elements, and structural variants. About 99.8% of bases in our unmappable regions are part of either segmental duplication or transposable elements and 98.3% overlap structural variant annotations. Notably, some of these regions overlap units with important biological functions, including 4% of protein-coding genes. In contrast, these regions have zero intersection with the ultraconserved elements, very low overlap with microRNAs, tRNAs, pseudogenes, CpG islands, tandem repeats, microsatellites, sensitive non-coding regions, and the mapping blacklist regions from the ENCODE project.
ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Computational Biology and Chemistry - Volume 53, Part A, December 2014, Pages 108-117
Journal: Computational Biology and Chemistry - Volume 53, Part A, December 2014, Pages 108-117
نویسندگان
Wentian Li, Jan Freudenberg,