

LaMPilot: An Open Benchmark Dataset for Autonomous Driving with Language Model Programs

Yunsheng Ma · Can Cui · Xu Cao · Wenqian Ye · Peiran Liu · Juanwu Lu · Amr Abdelraouf · Rohit Gupta · Kyungtae Han · Aniket Bera · James Rehg · Ziran Wang

Arch 4A-E Poster #52
Thu 20 Jun 5 p.m. PDT — 6:30 p.m. PDT


Autonomous driving (AD) has made significant strides in recent years. However, existing frameworks struggle to interpret and execute spontaneous user instructions, such as "overtake the car ahead." Large Language Models (LLMs) have demonstrated impressive reasoning capabilities, showing potential to bridge this gap. In this paper, we present LaMPilot, a novel framework that integrates LLMs into AD systems, enabling them to follow user instructions by generating code that leverages established functional primitives. We also introduce LaMPilot-Bench, the first benchmark dataset specifically designed to quantitatively evaluate the efficacy of language model programs in AD. Adopting the LaMPilot framework, we conduct extensive experiments to assess the performance of off-the-shelf LLMs on LaMPilot-Bench. Our results demonstrate the potential of LLMs to handle diverse driving scenarios and follow user instructions while driving. To facilitate further research in this area, we release our code and data at
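
As a rough illustration of what "generating code that leverages established functional primitives" could look like, the sketch below shows a toy program of the kind an LLM might emit for the instruction "overtake the car ahead." Every name here (`DrivingContext`, `get_lead_vehicle`, `is_lane_clear`, `change_lane`, `set_target_speed`) is a hypothetical stand-in invented for this example, not the actual LaMPilot API, and the gap and speed values are arbitrary assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Vehicle:
    lane: int        # lane index, 0 = leftmost (assumed convention)
    position: float  # longitudinal position along the road, in metres
    speed: float     # metres per second


@dataclass
class DrivingContext:
    """Hypothetical primitives exposed to the generated program."""
    ego: Vehicle
    others: List[Vehicle]
    num_lanes: int = 3
    log: List[str] = field(default_factory=list)

    def get_lead_vehicle(self) -> Optional[Vehicle]:
        # Nearest vehicle ahead of the ego vehicle in the same lane, if any.
        ahead = [v for v in self.others
                 if v.lane == self.ego.lane and v.position > self.ego.position]
        return min(ahead, key=lambda v: v.position, default=None)

    def is_lane_clear(self, lane: int, gap: float = 10.0) -> bool:
        # A lane counts as clear if it exists and no vehicle in it is within `gap` metres.
        if not 0 <= lane < self.num_lanes:
            return False
        return all(abs(v.position - self.ego.position) > gap
                   for v in self.others if v.lane == lane)

    def change_lane(self, lane: int) -> None:
        self.ego.lane = lane
        self.log.append(f"change_lane({lane})")

    def set_target_speed(self, speed: float) -> None:
        self.ego.speed = speed
        self.log.append(f"set_target_speed({speed})")


def overtake_car_ahead(ctx: DrivingContext) -> bool:
    """Toy example of a program an LLM might generate for 'overtake the car ahead'."""
    lead = ctx.get_lead_vehicle()
    if lead is None:
        return False  # nothing to overtake
    # Try the left lane first, then the right lane.
    for lane in (ctx.ego.lane - 1, ctx.ego.lane + 1):
        if ctx.is_lane_clear(lane):
            ctx.change_lane(lane)
            ctx.set_target_speed(lead.speed + 5.0)  # pass slightly faster than the lead car
            return True
    return False  # no safe lane available


ctx = DrivingContext(
    ego=Vehicle(lane=1, position=0.0, speed=20.0),
    others=[Vehicle(lane=1, position=30.0, speed=15.0),   # car ahead
            Vehicle(lane=0, position=5.0, speed=20.0)],   # blocks the left lane
)
succeeded = overtake_car_ahead(ctx)
```

In this toy scenario the left lane is occupied, so the program moves to the right lane and raises the target speed. The design point is that the generated program only composes the exposed primitives rather than issuing low-level control commands.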
