Article code: 6553173
Journal code: 1422141
Publication year: 2018
English article: 8 pages, PDF
Full-text version: free download
English title of the ISI article
Optimizing body fluid recognition from microbial taxonomic profiles
Persian translation of the title
بهینه سازی تشخیص مایع بدن از پروفیل های طبقه بندی میکروبی
Related topics
Life Sciences and Biotechnology; Biochemistry, Genetics and Molecular Biology; Genetics
English abstract
In forensics, the DNA profile is used to identify the person who left a biological trace, but information on the body fluid can also be essential in the evidence evaluation process. Microbial composition data could potentially be used for body fluid recognition as an improved alternative to the currently used presumptive tests. We have developed a customized workflow for interpretation of bacterial 16S sequence data based on a model composed of Partial Least Squares (PLS) in combination with Linear Discriminant Analysis (LDA). Large data sets from the Human Microbiome Project (HMP) and the American Gut Project (AGP) were used to test different settings in order to optimize performance. From the initial cross-validation of body fluid recognition within the HMP data, the optimal overall accuracy was close to 98%. Sensitivity values for the fecal and oral samples were ≥0.99, followed by the vaginal samples with 0.98 and the skin and nasal samples with 0.96 and 0.81, respectively. Specificity values were high for all 5 categories, mostly >0.99. This optimal performance was achieved by using the following settings: taxonomic profiles based on operational taxonomic units (OTUs) with 0.98 identity (OTU98), Aitchison's simplex transform with C = 1 pseudo-count, and no regularization (r = 1) in the PLS step. Variable selection did not improve the performance further. To test for robustness across sequencing platforms, we also trained the classifier on HMP data and tested it on the AGP data set. In this case, the standard OTU-based approach showed a moderate decline in accuracy. However, by using taxonomic profiles made by direct assignment of reads to a genus, we were able to nearly maintain the high accuracy levels. The optimal combination of settings was still used, except that the taxonomic level was genus instead of OTU98. The performance may be improved even further by using higher resolution taxonomic bins.
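Two quantities named in the abstract can be illustrated with a minimal sketch: the log-ratio transform applied to read counts after adding a pseudo-count, and the per-category sensitivity and specificity used to report performance. The sketch assumes that "Aitchison's simplex transform" means the centered log-ratio (CLR) transform, which the abstract does not state explicitly. The function names `clr_transform` and `sensitivity_specificity` and the toy labels are hypothetical, and the PLS and LDA steps are not reproduced here.

```python
import math


def clr_transform(counts, pseudo_count=1.0):
    """Centered log-ratio transform of one sample's taxon read counts.

    Assumption: this stands in for the "Aitchison's simplex transform"
    of the abstract; pseudo_count plays the role of C = 1.
    """
    # Add the pseudo-count so that taxa with zero reads have a finite log.
    shifted = [c + pseudo_count for c in counts]
    total = sum(shifted)
    # Log of the relative abundances (a point on the simplex).
    logs = [math.log(v / total) for v in shifted]
    # Centering on the mean log makes the transformed values sum to zero.
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


def sensitivity_specificity(true_labels, predicted_labels, category):
    """One-vs-rest sensitivity and specificity for a single category."""
    tp = fn = fp = tn = 0
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == category and pred == category:
            tp += 1
        elif truth == category:
            fn += 1
        elif pred == category:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy, made-up data for illustration only (not from the paper).
profile = clr_transform([120, 0, 30, 5])
truth = ["fecal", "fecal", "oral", "skin"]
guess = ["fecal", "oral", "oral", "skin"]
fecal_sens, fecal_spec = sensitivity_specificity(truth, guess, "fecal")
```

In a full workflow, the transformed profiles would be the input to the PLS step, and the LDA classifier would operate on the resulting PLS scores.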
Publisher
Database: Elsevier - ScienceDirect
Journal: Forensic Science International: Genetics - Volume 37, November 2018, Pages 13-20