Article code: 8420761
Journal code: 1545915
Publication year: 2016
English article: 17-page PDF
Full-text version: Free download
English title of the ISI article
Comparison of different sequencing and assembly strategies for a repeat-rich fungal genome, Ophiocordyceps sinensis
Related topics
Life Sciences and Biotechnology; Biochemistry, Genetics and Molecular Biology; Biotechnology
English abstract
Ophiocordyceps sinensis is one of the most expensive medicinal fungi worldwide, and has been used as a traditional Chinese medicine for centuries. In a recent report, the genome of this fungus was found to be expanded by extensive repetitive elements after assembly of Roche 454 (223 Mb) and Illumina HiSeq (10.6 Gb) sequencing data, producing a genome of 87.7 Mb with an N50 scaffold length of 12 kb and 6972 predicted genes. To test whether the assembly could be improved by deeper sequencing and to assess the amount of data needed for optimal assembly, genomic sequencing was run several times on genomic DNA extractions of a single ascospore isolate (strain 1229) on an Illumina HiSeq platform (25 Gb total data). Assemblies were produced using different data types (raw vs. trimmed) and data amounts, and using three freely available assembly programs (ABySS, SOAP and Velvet). In nearly all cases, trimming the data for low-quality base calls did not provide assemblies with higher N50 values compared to the non-trimmed data, and increasing the amount of input data (i.e., sequence reads) did not always lead to higher N50 values. Depending on the assembly program and data type, the maximal N50 was reached with between 50% and 90% of the total read data, equivalent to 100× to 200× coverage. The draft genome assembly was improved over the previously published version, resulting in a 114 Mb assembly with a scaffold N50 of 70 kb and 9610 predicted genes. Among the predicted genes, 9213 were validated by RNA-Seq analysis in this study, of which 8896 were found to be singletons. Evidence from genome and transcriptome analyses indicated that species assemblies could be improved with defined input material (e.g., a haploid mono-ascospore isolate) without the requirement of multiple sequencing technologies, multiple library sizes or data trimming for low-quality base calls, and with genome coverages between 100× and 200×.
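The abstract compares assemblies by N50 and by coverage without defining either. The sketch below illustrates both under the usual definitions: N50 is the length of the scaffold at which the running total of lengths, sorted longest first, first reaches half of the assembly size, and coverage is total sequenced bases divided by genome size. The function names and the toy scaffold lengths are hypothetical, and using the 114 Mb assembly size as the genome-size denominator is an assumption, since the abstract does not state which value the authors used.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length that brings the running total to at least
        # half of the assembly size is the N50.
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


def coverage(total_bases, genome_size):
    """Return the average sequencing depth (fold coverage)."""
    return total_bases / genome_size


# Hypothetical toy scaffold lengths, not data from the paper.
toy_scaffolds = [80, 70, 50, 40, 30, 20, 10]
print("toy N50:", n50(toy_scaffolds))

# Figures from the abstract: 25 Gb of reads, 114 Mb assembly.
# ASSUMPTION: the assembly size stands in for the true genome size.
full_depth = coverage(25e9, 114e6)
for fraction in (0.5, 0.9):
    print(f"{fraction:.0%} of reads: about {fraction * full_depth:.0f}x")
```

Under that assumption, 50% and 90% of the 25 Gb data set correspond to roughly 110× and 197×, in line with the 100× to 200× range quoted in the abstract.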
Publisher
Database: Elsevier - ScienceDirect
Journal: Journal of Microbiological Methods - Volume 128, September 2016, Pages 1-6