Learn to Learn Weight Generation via Local Consistency Diffusion
Abstract
generation. However, existing solutions face two challenges: limited generalizability and missing local supervision targets. The first stems from the inherent lack of cross-task transferability in existing single-level optimization methods, which limits performance on new tasks. The second arises because existing research models only the globally optimal weights, neglecting the supervision signals carried by local target weights; moreover, naively assigning local target weights introduces inconsistency between local and global objectives. To address these issues, we propose Mc-Di, which integrates the diffusion algorithm with meta-learning for better generalizability. We further extend vanilla diffusion into a local consistency diffusion algorithm. Our theoretical analysis and experimental results demonstrate that Mc-Di can learn from local targets while remaining consistent with the global optimum. We validate Mc-Di's superior accuracy and inference efficiency on tasks that require frequent weight updates, including transfer learning, few-shot learning, domain generalization, and language model fine-tuning.