Article code: 5513317
Journal code: 1541199
Publication year: 2017
English article: 9 pages, PDF
Full-text version: free download
English title of the ISI article
Detecting exact breakpoints of deletions with diversity in hepatitis B viral genomic DNA from next-generation sequencing data
Related topics
Life Sciences and Biotechnology; Biochemistry, Genetics and Molecular Biology; Biochemistry
English abstract


- The proposed method, VirDelect, detects exact breakpoints of deletions while accounting for the characteristics of HBV genomic DNA.
- Three phases are proposed to reduce the computational cost of split-read alignment without losing accuracy.
- VirDelect was validated on both simulated and real data to demonstrate its feasibility.

Many studies have suggested that deletions in the hepatitis B virus (HBV) genome are associated with the development of progressive liver diseases, which can ultimately result in hepatocellular carcinoma (HCC). Few of the existing methods for detecting deletions from next-generation sequencing (NGS) data consider the characteristics of viruses, such as high evolution rates and high divergence among different HBV genomes. Sequencing highly divergent HBV genomes with NGS technology produces millions of reads, so detecting the exact breakpoints of deletions in such large and complex data incurs a very high computational cost. We propose a novel analytical method named VirDelect (Virus Deletion Detect), which uses a split-read alignment approach to detect exact breakpoints and a diversity variable to account for high divergence in single-end read data, so that the computational cost is reduced without losing accuracy. We used four simulated read datasets and two real paired-end read datasets of HBV genome sequences to verify the accuracy of VirDelect with score functions. The experimental results show that VirDelect outperforms the state-of-the-art method Pindel in accuracy score on all simulated datasets, and that VirDelect had only two base errors even on the real datasets. VirDelect is also shown to deliver high accuracy on single-end read data as well as paired-end data. VirDelect can serve as an effective and efficient bioinformatics tool for physiologists, and it is applicable to further analyses of genomes with characteristics similar to HBV in genome length and high divergence. The VirDelect software can be downloaded at https://sourceforge.net/projects/virdelect/.
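
To make the idea of split-read alignment more concrete, the following is a minimal sketch of how a single read that spans a deletion can reveal the exact breakpoints. It is not VirDelect's algorithm: it uses exact string matching only, ignores sequencing errors and the diversity variable described in the abstract, and the names `find_deletion` and `min_anchor` are hypothetical, chosen purely for illustration.

```python
from typing import Optional, Tuple


def find_deletion(reference: str, read: str,
                  min_anchor: int = 5) -> Optional[Tuple[int, int]]:
    """Return the deleted reference interval [start, end) implied by `read`.

    Illustrative assumption: the read matches the reference exactly on both
    sides of a single deletion. Returns None if the read maps in full
    (no deletion) or if no valid split is found.
    """
    # A read that aligns end to end carries no deletion signal.
    if read in reference:
        return None

    # Try every split point, longest left anchor first. Each side must keep
    # at least `min_anchor` bases so that it can be placed reliably.
    for split in range(len(read) - min_anchor, min_anchor - 1, -1):
        prefix, suffix = read[:split], read[split:]

        # Place the left part of the read (first occurrence only).
        left_pos = reference.find(prefix)
        if left_pos == -1:
            continue
        left_end = left_pos + len(prefix)

        # Place the right part strictly downstream, leaving a gap of >= 1 base.
        right_pos = reference.find(suffix, left_end + 1)
        if right_pos != -1:
            return (left_end, right_pos)

    return None


if __name__ == "__main__":
    ref = "GATTACAGGCTCCATGAACGTTAGCCGTATCGGACTTGCA"
    # Simulate a genome with reference positions 14..21 deleted,
    # then take a read that spans the junction.
    sample = ref[:14] + ref[22:]
    read = sample[4:28]
    print(find_deletion(ref, read))
```

Trying the longest left anchor first means that, when the bases flanking a deletion are identical, the reported breakpoint is the rightmost equivalent position; a real tool also has to tolerate mismatches, which is where handling divergence becomes costly.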

Publisher
Database: Elsevier - ScienceDirect
Journal: Methods - Volume 129, 1 October 2017, Pages 24-32