Article code: 514970
Journal code: 866926
Publication year: 2015
English article: 23 pages, PDF
Full-text version: Free download
English title of the ISI article
Learning combination weights in data fusion using Genetic Algorithms
Related topics
Engineering and Basic Sciences, Computer Engineering, Computer Science Software
Abstract


• The Genetic Algorithm is an effective scheme for determining data fusion weights.
• Tuning the Genetic Algorithm improves time efficiency.
• Weight learning from only top ranked documents is useful.
• Redundant runs can be removed based on correlation between scores.

Researchers have shown that a weighted linear combination in data fusion can produce better results than an unweighted combination. Many techniques have been used to determine the linear combination weights. In this work, we use the Genetic Algorithm (GA) for this purpose. The GA is not new and has been used in several other applications. However, to the best of our knowledge, the GA has not been used for the fusion of runs in information retrieval. First, we use the GA to learn the optimum fusion weights using the entire set of relevance assessments. Next, we learn the weights from the relevance assessments of the top retrieved documents only. Finally, we learn the weights by two-fold training and testing on the queries. We test our method on runs submitted to TREC. Our weight learning scheme, using both full and partial sets of relevance assessments, produces significant improvements over the best candidate run, CombSUM, CombMNZ, Z-Score, the linear combination method with performance level weighting, the performance level square weighting scheme, the multiple linear regression-based weight learning scheme, the mixture model result merging scheme, LambdaMerge, ClustFuseCombSUM and ClustFuseCombMNZ. Furthermore, we study how the correlation among the scores in the runs can be used to eliminate redundant runs from a set of runs to be fused. We observe that similar runs make similar contributions to fusion, so eliminating the redundant runs in a group of similar runs does not hurt fusion performance in any significant way.
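The idea of learning linear combination weights with a GA can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the GA operators or the fitness measure, so the truncation selection, uniform crossover, Gaussian mutation, mean average precision fitness, and all function and parameter names below are assumptions made for illustration. Scores are assumed to be normalized to a comparable range beforehand.

```python
import random


def fuse(runs, weights):
    """Weighted linear combination: fused(d) = sum_i w_i * s_i(d)."""
    fused = {}
    for run, w in zip(runs, weights):
        for doc, score in run.items():
            fused[doc] = fused.get(doc, 0.0) + w * score
    # Rank by fused score, highest first; break ties by document id.
    return sorted(fused, key=lambda d: (-fused[d], d))


def average_precision(ranking, relevant):
    """Average precision of one ranked list against a set of relevant ids."""
    hits, total = 0, 0.0
    for rank, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def fitness(weights, queries):
    """Mean average precision; each query is a (runs, relevant_ids) pair."""
    scores = [average_precision(fuse(runs, weights), rel) for runs, rel in queries]
    return sum(scores) / len(scores)


def learn_weights(queries, n_runs, pop_size=20, generations=30,
                  mutation_rate=0.2, seed=0):
    """Hypothetical GA: truncation selection, uniform crossover, Gaussian mutation."""
    rng = random.Random(seed)
    population = [[rng.random() for _ in range(n_runs)] for _ in range(pop_size)]
    # Include uniform weights so the unweighted combination is one candidate.
    population[0] = [1.0] * n_runs
    for _ in range(generations):
        ranked = sorted(population, key=lambda w: fitness(w, queries), reverse=True)
        elite = ranked[: pop_size // 2]  # the fitter half survives unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]  # uniform crossover
            child = [
                min(1.0, max(0.0, g + rng.gauss(0, 0.1)))  # mutate, clamp to [0, 1]
                if rng.random() < mutation_rate else g
                for g in child
            ]
            children.append(child)
        population = elite + children
    return max(population, key=lambda w: fitness(w, queries))
```

Calling `learn_weights(queries, n_runs)` on a list of `(runs, relevant_ids)` pairs, where each run maps document ids to scores, returns one weight per run. Because the fitter half of each generation is carried over unchanged and uniform weights are in the initial population, the returned weights score at least as well on the training queries as the unweighted combination. Learning from top-ranked documents only would correspond to restricting `relevant_ids` to the assessed top documents.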

Publisher
Database: Elsevier - ScienceDirect
Journal: Information Processing & Management - Volume 51, Issue 3, May 2015, Pages 306–328