Unlocking the future where autonomous driving meets
the unlimited potential of language


DriveLM is an autonomous driving (AD) dataset that incorporates linguistic information and is built on the widely used nuScenes dataset. With DriveLM, we aim to connect large language models with autonomous driving systems.

Specifically, the dataset supports Perception, Prediction, and Planning (P3), with human-written reasoning logic connecting the three stages. Taking this a step further, we use the idea of Graph-of-Thought (GoT) to link the QA pairs in a graph-style structure, and we use "What if"-style questions to reason about future events.

You can visit our GitHub repository for more details.


In the DriveLM dataset, QAs are connected in a graph-style structure, with each QA pair as a node and the relationships between objects as the edges.
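
To make the graph idea concrete, here is a minimal Python sketch of QA pairs stored as nodes and object relationships stored as labelled edges. It is an illustration only: the class names (QANode, QAGraph), the field names, the stage labels, and the example questions are all assumptions made for this sketch and do not reflect the actual DriveLM file format or API.

```python
from collections import defaultdict, deque
from dataclasses import dataclass, field


@dataclass(frozen=True)
class QANode:
    """One QA pair (hypothetical schema, not the DriveLM format)."""
    node_id: str
    stage: str  # assumed labels: "perception", "prediction", "planning"
    question: str
    answer: str


@dataclass
class QAGraph:
    """Directed graph: QA pairs are nodes, object relationships are edges."""
    nodes: dict = field(default_factory=dict)
    edges: dict = field(default_factory=lambda: defaultdict(list))

    def add_node(self, node: QANode) -> None:
        self.nodes[node.node_id] = node

    def add_edge(self, src: str, dst: str, relation: str) -> None:
        # The relation label names the object that links the two QA pairs.
        if src not in self.nodes or dst not in self.nodes:
            raise KeyError("both endpoints must be added before linking them")
        self.edges[src].append((dst, relation))

    def topological_order(self) -> list:
        """Return node ids so that every QA comes after the QAs it depends on."""
        indegree = {nid: 0 for nid in self.nodes}
        for targets in self.edges.values():
            for dst, _ in targets:
                indegree[dst] += 1
        queue = deque(nid for nid, deg in indegree.items() if deg == 0)
        order = []
        while queue:
            nid = queue.popleft()
            order.append(nid)
            for dst, _ in self.edges.get(nid, []):
                indegree[dst] -= 1
                if indegree[dst] == 0:
                    queue.append(dst)
        if len(order) != len(self.nodes):
            raise ValueError("graph contains a cycle")
        return order


def build_example_graph() -> QAGraph:
    """Build a three-node chain with made-up questions and answers."""
    graph = QAGraph()
    graph.add_node(QANode("q1", "perception",
                          "What objects are ahead of the ego vehicle?",
                          "A pedestrian near the crosswalk."))
    graph.add_node(QANode("q2", "prediction",
                          "What will the pedestrian do next?",
                          "They will likely cross the road."))
    graph.add_node(QANode("q3", "planning",
                          "What should the ego vehicle do?",
                          "Slow down and yield."))
    graph.add_edge("q1", "q2", relation="pedestrian")
    graph.add_edge("q2", "q3", relation="pedestrian")
    return graph


if __name__ == "__main__":
    g = build_example_graph()
    for node_id in g.topological_order():
        node = g.nodes[node_id]
        print(f"[{node.stage}] {node.question} -> {node.answer}")
```

Walking the nodes in topological order means each question is visited only after the questions it depends on, which is one plausible way to feed earlier answers as context to later ones.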

Perception, Prediction, Planning

The central element of DriveLM is frame-wise P3 QA, where P3 stands for Perception, Prediction, and Planning. Covering all three stages lets the QA span the full autonomous driving stack.

What if

We try to reason about future events that have not yet happened. We do this by asking many "What if"-style questions, a common way for humans to imagine the future through language.

Please consider citing our project if it helps your research.
            title={DriveLM: Drive on Language},
            author={DriveLM Contributors},