Article code: 505011
Journal code: 864466
Publication year: 2015
English article: 8-page PDF
Full-text version: Free download
English title of the ISI article
Allele frequency calibration for SNP based genotyping of DNA pools: A regression based local–global error fusion method
Persian translation of the title
کالیبراسیون فرکانس آلل برای ژنوتیپ مبتنی بر SNP از مخازن DNA: یک روش تلفیق خطای سراسری ـ محلی مبتنی بر رگرسیون
Keywords
Allele frequency calibration; DNA pooling; SNP genotyping; Microarray; Machine learning
Related topics
Engineering and Basic Sciences; Computer Engineering; Computer Science Applications
English abstract


• A novel allele frequency estimation method.
• Machine learning based approach to estimation.
• A local–global error fusion method.

Background
The costs associated with developing high density microarray technologies are prohibitive for genotyping animals when there is low economic value associated with a single animal (e.g. prawns). DNA pooling is an attempt to address this issue by combining multiple DNA samples prior to genotyping. Instead of genotyping the DNA samples of the individuals, a mixture of DNA samples (i.e. the pool) from the individuals is genotyped only once. This greatly reduces the cost of genotyping. However, pooled samples are subject to greater genotyping inaccuracies than individual samples, and wrong genotyping will lead to wrong biological conclusions. It is thus necessary to calibrate the resulting genotypes (allele frequencies).

Methods
We present a regression based approach to translate raw array output to allele frequency. During training, a few pools and the individuals that constitute the pools are genotyped. Given the genotypes of the individuals that constitute a pool, we compute the true allele frequency. We then train a regression algorithm to produce a mapping from the raw array outputs to the true allele frequency. We test the algorithm using pool samples withheld from the training set. During prediction, we use this map to genotype pools with no prior knowledge of the individuals constituting the pools.

Results and discussion
After data quality control, a dataset of 912 pools is available. We estimate allele frequency using three approaches: the raw data, a commonly used piecewise linear transformation, and the proposed local–global learner fusion method. The resulting RMS errors for the three approaches are 0.135, 0.120, and 0.080, respectively.
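The training and evaluation loop described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's local–global error fusion method: a single global linear regression stands in for the learner, and all data are synthetic. The function names, the 0/1/2 genotype coding for diploid individuals, the pool sizes, and the simulated array distortion (`0.1 + 0.7 * truth` plus noise) are assumptions made only for illustration.

```python
import numpy as np


def true_allele_frequency(genotypes):
    """B-allele frequency of one pool from its members' genotypes.

    genotypes: shape (n_individuals, n_snps); each entry is the number of
    B alleles (0, 1 or 2) a diploid individual carries at that SNP.
    """
    genotypes = np.asarray(genotypes, dtype=float)
    return genotypes.sum(axis=0) / (2.0 * genotypes.shape[0])


def fit_linear_calibration(raw, truth):
    """Least-squares fit of truth ~ slope * raw + intercept."""
    raw = np.asarray(raw, dtype=float).ravel()
    truth = np.asarray(truth, dtype=float).ravel()
    design = np.column_stack([raw, np.ones_like(raw)])
    (slope, intercept), *_ = np.linalg.lstsq(design, truth, rcond=None)
    return float(slope), float(intercept)


def apply_calibration(slope, intercept, raw):
    """Map raw estimates to calibrated frequencies, clipped to [0, 1]."""
    return np.clip(slope * np.asarray(raw, dtype=float) + intercept, 0.0, 1.0)


def rmse(pred, truth):
    """Root-mean-square error between two arrays of frequencies."""
    diff = np.asarray(pred, dtype=float) - np.asarray(truth, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))


# Synthetic data only: every number below is invented for the demonstration.
rng = np.random.default_rng(0)
n_pools, n_individuals, n_snps = 40, 10, 50
pop_freq = rng.uniform(0.05, 0.95, size=n_snps)
genotypes = rng.binomial(2, pop_freq, size=(n_pools, n_individuals, n_snps))

# True pool frequencies, computed from the individuals in each pool.
truth = np.stack([true_allele_frequency(g) for g in genotypes])

# Hypothetical raw array estimate: a biased, noisy view of the truth.
raw = 0.1 + 0.7 * truth + rng.normal(0.0, 0.02, size=truth.shape)

# Train on 30 pools, then test on 10 pools withheld from training.
train, test = slice(0, 30), slice(30, 40)
slope, intercept = fit_linear_calibration(raw[train], truth[train])
calibrated = apply_calibration(slope, intercept, raw[test])

raw_rmse = rmse(raw[test], truth[test])
calibrated_rmse = rmse(calibrated, truth[test])
print(f"raw RMSE: {raw_rmse:.3f}, calibrated RMSE: {calibrated_rmse:.3f}")
```

The split is made by pool rather than by individual measurement, so the test error reflects pools the model has never seen, which matches the prediction setting in which the individuals constituting a pool are unknown.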

Publisher
Database: Elsevier - ScienceDirect
Journal: Computers in Biology and Medicine - Volume 61, 1 June 2015, Pages 48–55