Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
2820636 | Genomics | 2015 | 7 Pages |
• GLMM-based detection of associations between genomic loci and environmental variables
• Simulated data analysis to test GLMMs in such a study and optimize the sampling strategy
• Detection of genomic regions associated with temperature in an Arabidopsis SNP dataset
• Significant candidate genes found in these A. thaliana genomic regions
We tested the use of Generalized Linear Mixed Models (GLMMs) to detect associations between genetic loci and environmental variables while accounting for the population structure of the sampled individuals. We used a simulation approach to generate datasets under demographically and selectively explicit models. These datasets were used to analyze and optimize the capacity of GLMMs to detect associations between markers and selective coefficients, which served as the environmental data, in terms of false and true positive rates. We tested different sampling strategies that maximized the number of populations sampled, the number of sites sampled per population, or the number of individuals sampled per site, and determined the effect of different selective intensities on the efficiency of the method. Finally, we applied these models to an Arabidopsis thaliana SNP dataset from different accessions, looking for loci associated with spring minimum temperature. We identified 25 regions that exhibit unusual correlations with this climatic variable and contain genes with functions related to temperature stress.
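The evaluation criterion used in the simulation study, true and false positive rates, can be illustrated with a minimal sketch. It assumes that each simulated locus comes with a p-value from a model fit and a known flag saying whether the locus was simulated under selection. The function name `positive_rates`, the significance level, and the toy values are hypothetical and are not taken from the article; the sketch does not fit a GLMM itself.

```python
def positive_rates(p_values, under_selection, alpha=0.05):
    """Return (true_positive_rate, false_positive_rate) at level alpha.

    p_values        -- one association p-value per simulated locus (assumed input)
    under_selection -- True if the locus was simulated under selection
    alpha           -- significance threshold (hypothetical default)
    """
    if len(p_values) != len(under_selection):
        raise ValueError("p_values and under_selection must have the same length")

    tp = fp = fn = tn = 0
    for p, selected in zip(p_values, under_selection):
        detected = p < alpha
        if selected and detected:
            tp += 1  # selected locus correctly flagged
        elif selected:
            fn += 1  # selected locus missed
        elif detected:
            fp += 1  # neutral locus wrongly flagged
        else:
            tn += 1  # neutral locus correctly ignored

    # Rates are undefined when a class is empty, so return NaN in that case.
    tpr = tp / (tp + fn) if (tp + fn) else float("nan")
    fpr = fp / (fp + tn) if (fp + tn) else float("nan")
    return tpr, fpr


# Toy values for illustration only: three selected and three neutral loci.
p_values = [0.001, 0.20, 0.03, 0.60, 0.04, 0.90]
under_selection = [True, True, True, False, False, False]
tpr, fpr = positive_rates(p_values, under_selection)
print(f"TPR = {tpr:.3f}, FPR = {fpr:.3f}")
```

Computing both rates for each sampling strategy and selective intensity would give a simple basis for comparing them, under the assumption that the same threshold is used throughout.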