Mapping Networks
Abstract
The escalating parameter counts of modern deep learning models pose a fundamental challenge to efficient training and to the mitigation of overfitting. We address this by introducing Mapping Networks, which replace the high-dimensional weight space with a compact, trainable latent vector, based on the hypothesis that the trained parameters of large networks reside on smooth, low-dimensional manifolds. Building on this hypothesis, the Mapping Theorem, enforced in practice by a dedicated Mapping Loss, establishes the existence of a mapping from this latent space to the target weight space, both theoretically and empirically. Mapping Networks significantly reduce overfitting and achieve performance comparable to or better than the target network across complex vision and sequence tasks, including Image Classification and Deepfake Detection, with a 99.5% (around 500×) reduction in trainable parameters.
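As a rough illustration of the core idea, the minimal sketch below trains only a compact latent vector z and maps it into the flat weight vector of a fixed target MLP. Since the abstract does not specify the mapping architecture or the form of the Mapping Loss, the sketch substitutes a frozen random linear projection (in the spirit of intrinsic-dimension training) for the learned mapping; all names (target_forward, P, w0) and sizes are illustrative assumptions, not the paper's method.

import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Target network: a 784-64-10 MLP whose ~51k weights are never trained directly.
in_dim, hid, out_dim = 784, 64, 10
n_target = hid * in_dim + hid + out_dim * hid + out_dim  # 50,890 target weights

# Compact trainable latent vector: roughly 500x fewer trainable parameters,
# mirroring the reduction the abstract reports (illustrative choice).
latent_dim = 100
z = torch.zeros(latent_dim, requires_grad=True)

# Assumed stand-in for the learned mapping: a frozen random linear projection
# from latent space to the flat target weight space, plus a fixed init w0.
P = torch.randn(n_target, latent_dim) / latent_dim ** 0.5
w0 = 0.01 * torch.randn(n_target)

def target_forward(x, flat_w):
    # Slice the generated flat vector into the target MLP's layers.
    i = 0
    w1 = flat_w[i:i + hid * in_dim].view(hid, in_dim); i += hid * in_dim
    b1 = flat_w[i:i + hid]; i += hid
    w2 = flat_w[i:i + out_dim * hid].view(out_dim, hid); i += out_dim * hid
    b2 = flat_w[i:i + out_dim]
    return F.linear(F.relu(F.linear(x, w1, b1)), w2, b2)

opt = torch.optim.Adam([z], lr=1e-2)  # only the latent vector is optimized
x = torch.randn(32, in_dim)           # toy batch standing in for real data
y = torch.randint(0, out_dim, (32,))
for _ in range(100):
    opt.zero_grad()
    flat_w = w0 + P @ z  # map the latent vector into the target weight space
    loss = F.cross_entropy(target_forward(x, flat_w), y)
    loss.backward()
    opt.step()
print(f"trainable: {latent_dim} vs target: {n_target} ({n_target / latent_dim:.0f}x)")

Gradients flow through the frozen projection into z alone, so the optimizer touches only the latent coordinates; the paper's trainable mapping and Mapping Loss would replace the fixed projection P used here.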