Article code: 2820630
Journal code: 1160872
Publication year: 2015
English article: 6-page PDF
Full-text version: free download
English title of the ISI article
CS-SCORE: Rapid identification and removal of human genome contaminants from metagenomic datasets
Related topics
Life Sciences and Biotechnology; Biochemistry, Genetics and Molecular Biology; Genetics
English abstract


• Rapid identification of host sequences contaminating metagenomic datasets
• Low memory footprint for handling datasets of any size
• Sequence compositional signatures based heuristic pre-filtering mechanism
• Directed-mapping approach using novel compositional metric (cs-score)

Metagenomic sequencing data, obtained from host-associated microbial communities, are usually contaminated with host genome sequence fragments. Prior to performing any downstream analyses, it is necessary to identify and remove such contaminating sequence fragments. The time and memory requirements of available host-contamination detection techniques are enormous. Thus, processing of large metagenomic datasets is a challenging task. This study presents CS-SCORE, a novel algorithm that can rapidly identify host sequences contaminating metagenomic datasets. Validation results indicate that CS-SCORE is 2–6 times faster than the current state-of-the-art methods. Furthermore, the memory footprint of CS-SCORE is in the range of 2–2.5 GB, which is significantly lower than other available tools. CS-SCORE achieves this efficiency by incorporating (1) a heuristic pre-filtering mechanism and (2) a directed-mapping approach that utilizes a novel sequence composition metric (cs-score). CS-SCORE is expected to be a handy ‘pre-processing’ utility for researchers analyzing metagenomic datasets.

Availability
For academic users, an implementation of CS-SCORE is freely available at: http://metagenomics.atc.tcs.com/cs-score (or) https://metagenomics.atc.tcs.com/preprocessing/cs-score.
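
The abstract does not define the cs-score metric or the pre-filter, so the following is only a rough illustration of what a composition-based pre-filter could look like in general, not the CS-SCORE algorithm itself. It assumes tetranucleotide (k = 4) frequency vectors as the compositional signature and a Manhattan distance with an arbitrary threshold; the function names, the value of K, and the threshold are all hypothetical choices made for this sketch.

```python
from collections import Counter
from itertools import product

K = 4  # k-mer length; an assumption, not taken from the paper
KMERS = ["".join(p) for p in product("ACGT", repeat=K)]  # all 256 tetramers


def kmer_profile(seq):
    """Return the normalized K-mer frequency vector of a DNA sequence.

    K-mers containing characters other than A/C/G/T (e.g. N) are ignored.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + K] for i in range(len(seq) - K + 1))
    total = sum(counts[km] for km in KMERS)
    if total == 0:
        return [0.0] * len(KMERS)
    return [counts[km] / total for km in KMERS]


def profile_distance(p, q):
    """Manhattan distance between two frequency vectors (0 means identical)."""
    return sum(abs(a - b) for a, b in zip(p, q))


def prefilter(reads, host_profile, threshold):
    """Split reads into (candidates, passed).

    Reads whose composition lies within `threshold` of the host profile are
    flagged as candidate contaminants for a later, more expensive mapping
    step; all other reads pass through untouched.
    """
    candidates, passed = [], []
    for read in reads:
        d = profile_distance(kmer_profile(read), host_profile)
        (candidates if d <= threshold else passed).append(read)
    return candidates, passed


if __name__ == "__main__":
    # Toy "host" reference and two toy reads (hypothetical data).
    host = kmer_profile("ACGT" * 50)
    reads = ["ACGTACGTACGTACGT", "GGGGGGGGGGGGGGGG"]
    flagged, kept = prefilter(reads, host, threshold=0.5)
    print("candidate host reads:", flagged)
    print("reads kept:", kept)
```

The point of such a pre-filter is that a cheap per-read composition check lets most reads skip the costly alignment stage, which is consistent with the speed and memory claims in the abstract; how CS-SCORE actually computes and thresholds its metric is described in the paper, not here.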

Publisher
Database: Elsevier - ScienceDirect
Journal: Genomics - Volume 106, Issue 2, August 2015, Pages 116–121