| Article ID | Journal | Published Year | Pages | File Type |
| --- | --- | --- | --- | --- |
| 4947397 | Neurocomputing | 2017 | 15 | |
Abstract
Although deep learning shows high performance in pattern recognition and machine learning, the reasons for this remain unclear. To tackle this problem, we calculated information-theoretic variables of the representations in the hidden layers and analyzed their relationship to performance. We found that entropy and mutual information, each of which decreases in its own way as the layers deepen, are related to the generalization error after fine-tuning. This suggests that these information-theoretic variables might serve as a criterion for determining the number of layers in deep learning without fine-tuning, which requires a high computational load.
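The abstract does not specify how the entropy and mutual information of the hidden representations are estimated. Below is a minimal sketch of one common approach, a plug-in (histogram) estimator over discretized activations; the function names, the binning scheme, and the `layer_outputs` variable in the usage comment are assumptions for illustration, not the authors' method.

```python
import numpy as np

def discretize(activations, n_bins=10):
    """Bin continuous activations into integer codes (simple histogram estimator)."""
    edges = np.linspace(activations.min(), activations.max(), n_bins + 1)
    return np.digitize(activations, edges[1:-1])

def entropy(codes):
    """Shannon entropy (in bits) of a discrete code; each row is one sample's joint state."""
    _, counts = np.unique(codes, axis=0, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def mutual_information(x_codes, t_codes):
    """Plug-in estimate of I(X;T) = H(X) + H(T) - H(X,T)."""
    joint = np.concatenate([x_codes, t_codes], axis=1)
    return entropy(x_codes) + entropy(t_codes) - entropy(joint)

# Hypothetical usage: X is an input batch of shape (n_samples, n_inputs) and
# layer_outputs holds the hidden activations of each layer (both assumed here).
# x = discretize(X)
# for depth, T in enumerate(layer_outputs, start=1):
#     t = discretize(T)
#     print(depth, entropy(t), mutual_information(x, t))
```

With such estimates in hand, one could track how H(T) and I(X;T) change with depth and compare them against the generalization error measured after fine-tuning, as the abstract describes.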
Related Topics
Physical Sciences and Engineering
Computer Science
Artificial Intelligence
Authors
Yasutaka Furusho, Takatomi Kubo, Kazushi Ikeda