Article code: 2833821
Journal code: 1570815
Publication year: 2014
English article: 11-page PDF
Full-text version: free download
English title of the ISI article
Should genes with missing data be excluded from phylogenetic analyses?
Related topics
Life Sciences and Biotechnology / Agricultural and Biological Sciences / Ecology, Evolution, Behavior and Systematics
English abstract


• Concatenated data matrices often include one or more genes with missing data.
• The impact of including such genes on phylogenetic accuracy is largely unstudied.
• We examined the costs and benefits of adding vs. excluding genes with missing data.
• Adding incomplete genes increased the accuracy of phylogenetic analyses.
• Adding incomplete genes was especially helpful for resolving poorly resolved nodes.

Phylogeneticists often design their studies to maximize the number of genes included but minimize the overall amount of missing data. However, few studies have addressed the costs and benefits of adding characters with missing data, especially for likelihood analyses of multiple loci. In this paper, we address this topic using two empirical data sets (in yeast and plants) with well-resolved phylogenies. We introduce varying amounts of missing data into varying numbers of genes and test whether the benefits of excluding genes with missing data outweigh the costs of excluding the non-missing data that are associated with them. We also test if there is a proportion of missing data in the incomplete genes at which they cease to be beneficial or harmful, and whether missing data consistently bias branch length estimates. Our results indicate that adding incomplete genes generally increases the accuracy of phylogenetic analyses relative to excluding them, especially when there is a high proportion of incomplete genes in the overall dataset (and thus few complete genes). Detailed analyses suggest that adding incomplete genes is especially helpful for resolving poorly supported nodes. Given that we find that excluding genes with missing data often decreases accuracy relative to including these genes (and that decreases are generally of greater magnitude than increases), there is little basis for assuming that excluding these genes is necessarily the safer or more conservative approach. We also find no evidence that missing data consistently bias branch length estimates.
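
The manipulation described in the abstract (introducing a chosen proportion of missing data into a chosen number of genes of a concatenated matrix) can be sketched roughly as follows. This is only an illustration, not the authors' procedure: the function name `add_missing_data`, the toy alignment, and the choice to blank out a gene entirely for a random subset of taxa are all assumptions, since the abstract does not say how the missing cells were distributed.

```python
import random


def add_missing_data(alignment, gene_bounds, n_incomplete_genes,
                     missing_fraction, seed=0):
    """Return a copy of a concatenated alignment with '?' introduced.

    alignment: dict mapping taxon name -> concatenated sequence (equal lengths).
    gene_bounds: list of (start, end) column ranges, one per gene.
    n_incomplete_genes: how many genes receive missing data.
    missing_fraction: share of taxa whose sequence for each chosen gene
        is replaced by '?' (an assumed scheme, see the lead-in).
    """
    rng = random.Random(seed)
    taxa = sorted(alignment)
    # Pick which genes become "incomplete".
    chosen = rng.sample(range(len(gene_bounds)), n_incomplete_genes)
    n_masked_taxa = round(missing_fraction * len(taxa))

    masked = {taxon: list(seq) for taxon, seq in alignment.items()}
    for gene in chosen:
        start, end = gene_bounds[gene]
        # For this gene, blank out a random subset of taxa.
        for taxon in rng.sample(taxa, n_masked_taxa):
            masked[taxon][start:end] = "?" * (end - start)

    return {t: "".join(chars) for t, chars in masked.items()}, sorted(chosen)


# Toy example: 4 taxa, 3 genes of 4 sites each (hypothetical data).
alignment = {
    "taxonA": "ACGTACGTACGT",
    "taxonB": "ACGAACGTACGA",
    "taxonC": "ACTTACGGACGT",
    "taxonD": "AGGTACGTTCGT",
}
gene_bounds = [(0, 4), (4, 8), (8, 12)]
masked, chosen = add_missing_data(
    alignment, gene_bounds, n_incomplete_genes=2, missing_fraction=0.5, seed=1
)
```

Comparing trees inferred from `masked` with trees inferred after dropping the genes listed in `chosen` would mirror the include-versus-exclude comparison the abstract describes.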


Publisher
Database: Elsevier - ScienceDirect
Journal: Molecular Phylogenetics and Evolution - Volume 80, November 2014, Pages 308–318