Reinforcement Learning Journal, vol. TBD, 2025, pp. TBD.
Presented at the Reinforcement Learning Conference (RLC), Edmonton, Alberta, Canada, August 5–9, 2025.
Learning-based decision-making has the potential to enable generalizable Autonomous Driving (AD) policies, reducing the engineering overhead of rule-based approaches. Imitation Learning (IL) remains the dominant paradigm, benefiting from large-scale human demonstration datasets, but it suffers from inherent limitations such as distribution shift and imitation gaps. Reinforcement Learning (RL) presents a promising alternative, yet its adoption in AD remains limited due to the lack of standardized and efficient research frameworks. To this end, we introduce V-Max, an open research framework providing all the necessary tools to facilitate RL research for AD. V-Max is built on Waymax, a hardware-accelerated AD simulator designed for large-scale experimentation. We extend it using the ScenarioNet approach, enabling the fast simulation of diverse AD datasets. V-Max integrates a set of observation and reward functions, transformer-based encoders, and training pipelines. Additionally, it includes adversarial evaluation settings and an extensive set of evaluation metrics. Through a large-scale benchmark, we investigate how network architectures, observation functions, training data, and reward shaping impact RL performance.
Valentin Charraut, Waël Doulazmi, Thomas Tournaire, and Thibault Buhet. "V-Max: A RL Framework for Autonomous Driving." Reinforcement Learning Journal, vol. TBD, 2025, pp. TBD.
BibTeX:
@article{charraut2025framework,
  title={{V-Max}: {A} {RL} Framework for Autonomous Driving},
  author={Charraut, Valentin and Doulazmi, Wa{\"{e}}l and Tournaire, Thomas and Buhet, Thibault},
  journal={Reinforcement Learning Journal},
  year={2025}
}