| Article ID | Journal | Published Year | Pages | File Type |
|---|---|---|---|---|
| 5521180 | Drug Discovery Today | 2017 | 6 | |
- Analysis of massive NGS datasets poses difficult computational challenges.
- Big data algorithms are often adapted for NGS analysis.
- HPC becomes pivotal as NGS moves from research labs to medical applications.
The progress of next-generation sequencing has a major impact on medical and genomic research. This high-throughput technology can now produce billions of short DNA or RNA fragments, amounting to several terabytes of data, in a single run. This leads to massive datasets used by a wide range of applications, including personalized cancer treatment and precision medicine. In addition to the hugely increased throughput, the cost of high-throughput sequencing has decreased dramatically. A sequencing cost of around US$1000 per genome has now rendered large population-scale projects feasible. However, making effective use of the produced data requires the design of big data algorithms and their efficient implementation on modern high-performance computing systems.
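As a rough illustration of the data volumes behind these claims, the sketch below estimates the raw size of a single sequencing run. The read count, read length, and per-base storage overhead are illustrative assumptions, not figures taken from the article.

```python
# Back-of-envelope estimate of the raw data volume of one sequencing run.
# All parameters below are assumed for illustration only.

reads_per_run = 3_000_000_000   # billions of short fragments per run (assumed)
read_length_bp = 150            # typical short-read length in base pairs (assumed)
bytes_per_base = 2              # ~1 byte per base call + ~1 byte per quality score (uncompressed FASTQ)

total_bytes = reads_per_run * read_length_bp * bytes_per_base
print(f"Approximate uncompressed run size: {total_bytes / 1e12:.1f} TB")
# -> roughly 0.9 TB; paired-end reads, identifiers, and metadata push a run
#    into the multi-terabyte range mentioned in the abstract.
```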