Coordinate Denoising for Non‑Equilibrium Molecular Representation Learning
Abstract
Three-dimensional molecular representation learning has shown promise for modeling chemical structures and their properties. However, most existing approaches implicitly assume that molecules are at or near equilibrium. This assumption breaks down for non-equilibrium structures, which are ubiquitous in molecular dynamics (MD) trajectories: there, standard coordinate denoising fails because the direct equivalence between denoising scores and atomic forces no longer holds. To bridge this gap, we propose Node Denoising on non-Equilibrium Molecules (NDeM), an auxiliary task grounded in a second-order finite difference approximation of the potential energy surface. By explicitly accounting for the non-zero forces inherent to non-equilibrium states, NDeM provides a theoretically grounded denoising objective applicable to arbitrary molecular conformations. NDeM is designed as a lightweight, architecture-agnostic plugin that requires no pre-training and can be integrated into existing supervised learning pipelines. Experiments on benchmarks including MD17, QM9, and the large-scale OC20 dataset show that NDeM consistently improves baseline models and achieves competitive performance in both equilibrium and non-equilibrium regimes.
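The role of the force term can be illustrated with a minimal numerical sketch. It is not the NDeM objective itself: the one-dimensional Morse potential, its parameters, and all function names below are hypothetical choices made only for illustration. The sketch compares a second-order expansion of the energy change under a small displacement, with and without the first-order (force) term, at an equilibrium point and at a non-equilibrium point.

```python
import math

# Hypothetical 1D Morse potential; parameter values are illustrative only.
D, A, R0 = 1.0, 1.5, 1.0  # well depth, width parameter, equilibrium distance


def energy(r):
    """Morse potential energy at distance r."""
    return D * (1.0 - math.exp(-A * (r - R0))) ** 2


def force(r):
    """Analytic force F = -dE/dr; it is zero only at r = R0."""
    e = math.exp(-A * (r - R0))
    return -2.0 * D * A * (1.0 - e) * e


def central_second_difference(r, h=1e-3):
    """Second-order central finite difference estimate of d^2E/dr^2."""
    return (energy(r + h) - 2.0 * energy(r) + energy(r - h)) / h ** 2


def taylor_energy_change(r, delta, include_force):
    """Second-order estimate of E(r + delta) - E(r).

    With include_force=False the first-order term is dropped, which is
    only justified when the force at r vanishes (equilibrium).
    """
    first_order = -force(r) * delta if include_force else 0.0
    return first_order + 0.5 * central_second_difference(r) * delta ** 2


def approximation_errors(r, delta):
    """Absolute errors of the expansion without and with the force term."""
    true_change = energy(r + delta) - energy(r)
    err_without = abs(true_change - taylor_energy_change(r, delta, False))
    err_with = abs(true_change - taylor_energy_change(r, delta, True))
    return err_without, err_with


if __name__ == "__main__":
    for label, r in [("equilibrium", R0), ("non-equilibrium", 1.3)]:
        err_without, err_with = approximation_errors(r, 0.05)
        print(f"{label}: without force term {err_without:.2e}, "
              f"with force term {err_with:.2e}")
```

At the equilibrium point the two estimates coincide because the force is zero there. Away from equilibrium, dropping the force term leaves a much larger error, which is the kind of mismatch the abstract attributes to equilibrium-based denoising on MD-style conformations.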