V-Max: A RL Framework for Autonomous Driving

By Valentin Charraut, Waël Doulazmi, Thomas Tournaire, and Thibault Buhet

Reinforcement Learning Journal, vol. TBD, 2025, pp. TBD.

Presented at the Reinforcement Learning Conference (RLC), Edmonton, Alberta, Canada, August 5–9, 2025.


Abstract:

Learning-based decision-making has the potential to enable generalizable Autonomous Driving (AD) policies, reducing the engineering overhead of rule-based approaches. Imitation Learning (IL) remains the dominant paradigm, benefiting from large-scale human demonstration datasets, but it suffers from inherent limitations such as distribution shift and imitation gaps. Reinforcement Learning (RL) presents a promising alternative, yet its adoption in AD remains limited due to the lack of standardized and efficient research frameworks. To this end, we introduce V-Max, an open research framework providing all the necessary tools to facilitate RL research for AD. V-Max is built on Waymax, a hardware-accelerated AD simulator designed for large-scale experimentation. We extend it using the ScenarioNet approach, enabling the fast simulation of diverse AD datasets. V-Max integrates a set of observation and reward functions, transformer-based encoders, and training pipelines. Additionally, it includes adversarial evaluation settings and an extensive set of evaluation metrics. Through a large-scale benchmark, we investigate how network architectures, observation functions, training data, and reward shaping impact RL performance.


Citation Information:

Valentin Charraut, Waël Doulazmi, Thomas Tournaire, and Thibault Buhet. "V-Max: A RL Framework for Autonomous Driving." Reinforcement Learning Journal, vol. TBD, 2025, pp. TBD.

BibTeX:
@article{charraut2025framework,
    title={{V-Max}: {A} {RL} Framework for Autonomous Driving},
    author={Charraut, Valentin and Doulazmi, Wa{\"{e}}l and Tournaire, Thomas and Buhet, Thibault},
    journal={Reinforcement Learning Journal},
    year={2025}
}