BuildingGPT: Auto-Regressive Building Wireframe Reconstruction Model with Reinforcement Learning
Abstract
In this paper, we propose BuildingGPT, a novel auto-regressive model with reinforcement learning for reconstructing building wireframes from point clouds. Unlike prior works based on detection or diffusion models, BuildingGPT reformulates building wireframe reconstruction as a sequence prediction problem. Using a hierarchical building wireframe tokenization, wireframe sequences are organized in a structurally and semantically aware order for next-token prediction. A point cloud encoder first transforms the input point cloud into a fixed-length latent code that serves as the start of the sequence. BuildingGPT then auto-regressively predicts tokens conditioned on the latent code and the previously generated tokens. Once the token sequence is predicted, the building wireframe is obtained through detokenization. To enhance model performance, we adopt a two-stage training paradigm consisting of pre-training and post-training. After auto-regressive pre-training, Direct Preference Optimization (DPO) is employed as a post-training strategy to align reconstruction results with human preferences. Extensive experiments on the large-scale MunichWF dataset show that BuildingGPT outperforms existing state-of-the-art methods. We commit to releasing the code and dataset.
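
To make the post-training step concrete, the following is a minimal sketch of the standard DPO loss for a single preference pair. It is not the authors' implementation: the function names, the scalar sequence-level log-probability inputs, and the default `beta` are illustrative assumptions. In the setting described above, "chosen" and "rejected" would be two wireframe token sequences for the same point cloud, with the pre-trained model presumably serving as the frozen reference.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def dpo_loss(
    policy_logp_chosen: float,
    policy_logp_rejected: float,
    ref_logp_chosen: float,
    ref_logp_rejected: float,
    beta: float = 0.1,  # assumed value; the abstract does not state one
) -> float:
    """Standard DPO loss for one (chosen, rejected) sequence pair.

    Each argument is the summed token log-probability of a full sequence
    under the trainable policy or the frozen reference model.
    """
    # Implicit rewards: how much the policy has moved away from the reference.
    chosen_margin = policy_logp_chosen - ref_logp_chosen
    rejected_margin = policy_logp_rejected - ref_logp_rejected
    # The loss falls as the policy favors the chosen sequence more strongly.
    return -log_sigmoid(beta * (chosen_margin - rejected_margin))


if __name__ == "__main__":
    # Hypothetical log-probabilities, for illustration only.
    print(dpo_loss(-1.0, -3.0, -2.0, -2.0))
```

When the policy and the reference assign identical log-probabilities, the loss equals log 2; it decreases as the policy raises the likelihood of the preferred sequence relative to the rejected one.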