Asymptotic spectra of the Fisher Information Matrix in random shallow neural networks
Neural networks exhibit interesting properties in the asymptotic limit where the number of neurons diverges (the so-called infinite-width limit), as well as in the thermodynamic proportional limit, where the number of training samples also diverges, proportionally to the number of neurons. This holds both in the dynamical regime of learning and in the Bayesian setting. The Fisher Information Matrix (FIM) is a central object in statistics and machine learning. Within the framework of information geometry, largely developed by Amari, the FIM is the natural metric tensor on the parameter space, viewed as a Riemannian manifold. This geometric interpretation, applied to learning in neural networks, underlies methods such as natural gradient descent and can provide new insights into the geometric structure of both the parameter space and the loss landscape. In this talk, I will present our work on shallow neural networks with random parameters, in which we analyze the asymptotic spectrum of their FIM using techniques from Random Matrix Theory and replica calculations. We derive expressions for the resolvent of the FIM both in the infinite-width limit (previously investigated by Amari) and in the proportional limit. We carry out the calculations explicitly in the linear case and, following previous works, assume the validity of a Wishart ansatz for the empirical kernels of nonlinear networks in the proportional regime.
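To fix ideas, a minimal sketch of the objects mentioned above (the notation here is assumed for illustration and need not match the authors' conventions): for a parametric model $p(y \mid x, \theta)$ with $P$ parameters, the FIM and the resolvent from which its asymptotic spectral density is obtained can be written as

\[
  F(\theta) \;=\; \mathbb{E}_{x,\, y \sim p(y \mid x, \theta)}
  \Big[ \nabla_\theta \log p(y \mid x, \theta)\, \nabla_\theta \log p(y \mid x, \theta)^{\top} \Big],
\]
\[
  g(z) \;=\; \lim_{P \to \infty} \frac{1}{P}\, \mathbb{E}\,\mathrm{Tr}\,\big( z\, I_P - F \big)^{-1},
  \qquad
  \rho(\lambda) \;=\; \frac{1}{\pi} \lim_{\epsilon \to 0^{+}} \mathrm{Im}\, g(\lambda - i\epsilon),
\]

where the expectation in the resolvent is over the random parameters (and, for the empirical FIM, over the training samples). In the infinite-width limit only the number of neurons diverges, whereas in the proportional regime the number of samples diverges at a fixed ratio with the network size, so sample-size fluctuations survive in the limiting spectrum.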