DeepCubeAF: A Foundation Model for Generalizable Pathfinding Heuristics

By Vedant Khandelwal, Amit Sheth, and Forest Agostinelli

Reinforcement Learning Journal, vol. TBD, 2025, pp. TBD.

Presented at the Reinforcement Learning Conference (RLC), Edmonton, Alberta, Canada, August 5–9, 2025.

Download: Note: Paper unavailable until authors provide the signed publication agreement.

Abstract:

Pathfinding problems arise in fields such as robotics, mathematics, chemistry, and program synthesis, where the objective is to find a sequence of actions that transforms a given start state into a goal state. Recently, deep reinforcement learning (DRL) has emerged as a promising method for automatically training domain-specific heuristic functions to solve these problems in a largely domain-independent fashion. However, these approaches often require retraining for even a slight change in domain, resulting in significant resource and time inefficiencies. While existing approaches use supervised learning to learn generalizable heuristics that handle unseen domains, they are limited by the need to obtain supervised labels. To address these limitations, we draw inspiration from domain randomization in reinforcement learning and from the DeepCubeA algorithm and introduce DeepCubeA for foundation models (DeepCubeAF). DeepCubeAF trains a heuristic function across randomly generated domains using reinforcement learning and uses this trained heuristic function with batch weighted A* search to solve problems. Our model consistently shows better generalizability than the existing foundation model for both seen and unseen domains. This work represents a step toward training robust, generalizable models and providing access to these models to experts across various fields.
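
To make the search step mentioned in the abstract concrete, the following is a minimal sketch of a batch weighted A* loop, not the authors' implementation. It assumes a node priority of weight * g + h, where g is the path cost so far and h is the heuristic estimate, and it evaluates the heuristic once per batch of newly generated states. The function names (batch_weighted_astar, reconstruct, line_neighbors, line_heuristic), the parameters (weight, batch_size, max_iterations), and the toy number-line domain are all hypothetical and chosen for illustration. In DeepCubeAF the heuristic would be the trained model; here a hand-written distance function stands in for it.

```python
import heapq
from itertools import count


def reconstruct(parent, state):
    """Walk parent links back to the start and return the action list."""
    actions = []
    while parent[state] is not None:
        state, action = parent[state]
        actions.append(action)
    actions.reverse()
    return actions


def batch_weighted_astar(start, is_goal, neighbors, heuristic_batch,
                         weight=0.8, batch_size=4, max_iterations=10_000):
    """Return a list of actions from start to a goal state, or None.

    neighbors(state) yields (action, next_state, step_cost) tuples.
    heuristic_batch(states) returns one cost-to-go estimate per state.
    Priority is weight * g + h (an assumed weighting for this sketch).
    """
    tie = count()  # tie-breaker so the heap never compares states directly
    best_g = {start: 0.0}
    parent = {start: None}  # state -> (previous_state, action)
    frontier = [(heuristic_batch([start])[0], next(tie), 0.0, start)]

    for _ in range(max_iterations):
        if not frontier:
            return None
        # Pop up to batch_size of the lowest-priority nodes.
        batch = []
        while frontier and len(batch) < batch_size:
            _, _, g, state = heapq.heappop(frontier)
            if g > best_g[state]:
                continue  # stale entry: a cheaper path was found later
            if is_goal(state):
                return reconstruct(parent, state)
            batch.append((g, state))
        # Expand every node in the batch and collect improved children.
        children = []
        for g, state in batch:
            for action, nxt, cost in neighbors(state):
                new_g = g + cost
                if new_g < best_g.get(nxt, float("inf")):
                    best_g[nxt] = new_g
                    parent[nxt] = (state, action)
                    children.append((new_g, nxt))
        # One heuristic call for the whole batch of children.
        if children:
            h_values = heuristic_batch([s for _, s in children])
            for (g, s), h in zip(children, h_values):
                heapq.heappush(frontier, (weight * g + h, next(tie), g, s))
    return None


# Toy domain (hypothetical): walk along the integers from 0 to 5.
GOAL = 5


def line_neighbors(state):
    yield "+1", state + 1, 1.0
    yield "-1", state - 1, 1.0


def line_heuristic(states):
    # Stand-in for a learned heuristic: exact distance to the goal.
    return [float(abs(GOAL - s)) for s in states]


path = batch_weighted_astar(0, lambda s: s == GOAL,
                            line_neighbors, line_heuristic)
print(path)
```

Batching matters mainly when the heuristic is a neural network, since scoring many states in one forward pass is cheaper than scoring them one at a time. A weight below 1 makes the search greedier with respect to the heuristic, typically trading some path quality for fewer expansions.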

Citation Information:

Vedant Khandelwal, Amit Sheth, and Forest Agostinelli. "DeepCubeAF: A Foundation Model for Generalizable Pathfinding Heuristics." Reinforcement Learning Journal, vol. TBD, 2025, pp. TBD.

BibTeX:

@article{khandelwal2025deepcubeaf,
    title={{DeepCubeAF}: {A} Foundation Model for Generalizable Pathfinding Heuristics},
    author={Khandelwal, Vedant and Sheth, Amit and Agostinelli, Forest},
    journal={Reinforcement Learning Journal},
    year={2025}
}