Article ID: 4947397
Journal: Neurocomputing
Published Year: 2017
Pages: 15
File Type: PDF
Abstract
Although deep learning achieves high performance in pattern recognition and machine learning, the reasons for this remain unclear. To tackle this problem, we calculated information-theoretic quantities of the representations in the hidden layers and analyzed their relationship to performance. We found that entropy and mutual information, each of which decreases in a different manner as the layers deepen, are related to the generalization error after fine-tuning. This suggests that these information-theoretic quantities could serve as a criterion for determining the number of layers in deep learning without fine-tuning, which requires a high computational load.
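The abstract does not specify how the entropy and mutual information of hidden-layer representations were estimated, so the following is only a minimal illustrative sketch in Python, assuming a simple histogram-binning estimator and a toy random network with 2-unit layers (so the joint histogram stays tractable); the layer sizes, bin counts, and network weights are all hypothetical.

import numpy as np

rng = np.random.default_rng(0)

def entropy(samples, bins=10):
    # Shannon entropy (in bits) estimated from a joint histogram
    # over all columns of `samples` (shape: n_samples x n_dims).
    counts, _ = np.histogramdd(samples, bins=bins)
    p = counts.ravel() / counts.sum()
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def mutual_information(x, y, bins=10):
    # I(X;Y) = H(X) + H(Y) - H(X,Y), each term estimated by binning.
    return entropy(x, bins) + entropy(y, bins) - entropy(np.hstack([x, y]), bins)

# Toy two-layer network (illustrative, not the paper's architecture):
# track how H(h) and I(X;h) change as the layer deepens.
x = rng.normal(size=(5000, 2))                    # "input" samples
W1 = rng.normal(size=(2, 2))
W2 = rng.normal(size=(2, 2))
h1 = np.tanh(x @ W1)                              # hidden layer 1
h2 = np.tanh(h1 @ W2)                             # hidden layer 2

for name, h in [("h1", h1), ("h2", h2)]:
    print(name,
          "H =", round(entropy(h), 3),
          "I(X;h) =", round(mutual_information(x, h), 3))

Binning estimators scale poorly with dimension, so for realistically wide layers one would substitute a dedicated mutual-information estimator; this sketch only illustrates the per-layer quantities the abstract refers to.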
Related Topics
Physical Sciences and Engineering › Computer Science › Artificial Intelligence