Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
406343 | 678078 | 2014 | 19-page PDF | Free download |
We are interested in developing a safe semi-supervised learning method that works in any situation. Semi-supervised learning postulates that n′ unlabeled data are available in addition to n labeled data. However, almost all previous semi-supervised methods require additional assumptions (beyond the availability of unlabeled data) to improve on supervised learning. If such assumptions are not met, these methods may perform worse than supervised learning. Sokolovska, Cappé, and Yvon (2008) proposed a semi-supervised method based on a weighted likelihood approach. They proved that this method asymptotically never performs worse than supervised learning (i.e., it is safe) without any assumption. Their method is attractive because it is easy to implement and is potentially general. Moreover, it is deeply related to a certain statistical paradox. However, the method of Sokolovska et al. (2008) applies only to a very limited setting: classification, discrete covariates, n′→∞, and a maximum likelihood estimator. In this paper, we extend their method by modifying the weight. We prove that our proposal is safe in a significantly wider range of situations as long as n≤n′. Further, we give a geometrical interpretation of the proof of safety through its relationship with the above-mentioned statistical paradox. Finally, we show that the above proposal is asymptotically safe even when n′
Journal: Neural Networks - Volume 53, May 2014, Pages 146–164
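
The abstract does not state the weight explicitly, so the following is only a minimal sketch of what a weighted likelihood approach with discrete covariates can look like. It assumes that each labeled point is weighted by the ratio of its covariate's frequency among the unlabeled data to its frequency among the labeled data; the modified weight proposed in the paper is not reproduced here. The function names, the two-parameter logistic model, and the gradient-ascent settings are all hypothetical choices made for illustration.

```python
import math
from collections import Counter


def density_ratio_weights(x_labeled, x_unlabeled):
    """Assumed weight for each labeled point (not taken from the paper):
    (frequency of x among unlabeled data) / (frequency of x among labeled data)."""
    n, n_prime = len(x_labeled), len(x_unlabeled)
    labeled_counts = Counter(x_labeled)
    unlabeled_counts = Counter(x_unlabeled)
    return [
        (unlabeled_counts[x] / n_prime) / (labeled_counts[x] / n)
        for x in x_labeled
    ]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def weighted_log_likelihood(theta, xs, ys, weights):
    """Weighted conditional log-likelihood of the illustrative model
    p(y = 1 | x) = sigmoid(theta0 + theta1 * x)."""
    theta0, theta1 = theta
    total = 0.0
    for x, y, w in zip(xs, ys, weights):
        p = sigmoid(theta0 + theta1 * x)
        total += w * (y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total


def fit_weighted_logistic(xs, ys, weights, lr=0.5, steps=2000):
    """Maximize the weighted log-likelihood by plain gradient ascent.
    The gradient is divided by the total weight so the step size is stable."""
    theta0, theta1 = 0.0, 0.0
    total_weight = sum(weights)
    for _ in range(steps):
        g0, g1 = 0.0, 0.0
        for x, y, w in zip(xs, ys, weights):
            residual = y - sigmoid(theta0 + theta1 * x)
            g0 += w * residual
            g1 += w * residual * x
        theta0 += lr * g0 / total_weight
        theta1 += lr * g1 / total_weight
    return theta0, theta1


if __name__ == "__main__":
    x_lab = [0, 0, 1, 1]
    y_lab = [0, 1, 1, 0]
    x_unl = [0, 1, 1, 1]
    w = density_ratio_weights(x_lab, x_unl)
    theta = fit_weighted_logistic(x_lab, y_lab, w)
    print(w, theta, weighted_log_likelihood(theta, x_lab, y_lab, w))
```

With labeled covariates `[0, 0, 1, 1]` and unlabeled covariates `[0, 1, 1, 1]`, the assumed weights are `[0.5, 0.5, 1.5, 1.5]`: labeled points whose covariate value is more common in the unlabeled sample count for more. Setting every weight to 1 recovers the ordinary supervised maximum likelihood fit.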