Multi-view Disentanglement for Reinforcement Learning with Multiple Cameras

By Mhairi Dunion and Stefano V Albrecht

Reinforcement Learning Journal, vol. 2, 2024, pp. 498–515.

Presented at the Reinforcement Learning Conference (RLC), Amherst, Massachusetts, August 9–12, 2024.

Abstract:

The performance of image-based Reinforcement Learning (RL) agents can vary depending on the position of the camera used to capture the images. Training on multiple cameras simultaneously, including a first-person egocentric camera, can leverage information from different camera perspectives to improve the performance of RL. However, hardware constraints may limit the availability of multiple cameras in real-world deployment. Additionally, cameras may become damaged in the real world, preventing access to all of the cameras that were used during training. To overcome these hardware constraints, we propose Multi-View Disentanglement (MVD), which uses multiple cameras to learn a policy that is robust to a reduction in the number of cameras and generalises to any single camera from the training set. Our approach is a self-supervised auxiliary task for RL that learns a disentangled representation from multiple cameras, with a shared representation that is aligned across all cameras to allow generalisation to a single camera, and a private representation that is camera-specific. We show experimentally that an RL agent trained on a single third-person camera is unable to learn an optimal policy in many control tasks; however, our approach, benefiting from multiple cameras during training, is able to solve the task using only the same single third-person camera.
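The shared/private split described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation or loss: the function names, the fixed split point `shared_dim`, the toy embeddings, and the use of a mean squared difference as the alignment term are all assumptions made here purely for illustration.

```python
from typing import List, Sequence, Tuple


def split_representation(
    z: Sequence[float], shared_dim: int
) -> Tuple[List[float], List[float]]:
    """Split one camera's embedding into (shared, private) parts.

    Hypothetical helper: the first `shared_dim` entries are treated as the
    shared part, the remainder as the camera-specific private part.
    """
    if not 0 < shared_dim < len(z):
        raise ValueError("shared_dim must leave room for both parts")
    return list(z[:shared_dim]), list(z[shared_dim:])


def alignment_loss(shared_a: Sequence[float], shared_b: Sequence[float]) -> float:
    """Mean squared difference between two cameras' shared parts.

    A stand-in alignment term (an assumption, not the paper's objective):
    it is zero only when both cameras produce the same shared representation.
    """
    if len(shared_a) != len(shared_b):
        raise ValueError("shared parts must have the same length")
    return sum((a - b) ** 2 for a, b in zip(shared_a, shared_b)) / len(shared_a)


# Toy embeddings of the same environment state seen from two cameras.
z_first_person = [0.5, 1.0, 3.0, -2.0]
z_third_person = [0.5, 2.0, -1.0, 4.0]

shared_fp, private_fp = split_representation(z_first_person, shared_dim=2)
shared_tp, private_tp = split_representation(z_third_person, shared_dim=2)

# Only the shared parts are compared; the private parts are free to differ.
loss = alignment_loss(shared_fp, shared_tp)
```

In this sketch, a policy that reads only the shared part could in principle be fed by any single camera once the alignment term is small, which is the intuition behind generalising from many cameras to one.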

Citation Information:

Mhairi Dunion and Stefano V Albrecht. "Multi-view Disentanglement for Reinforcement Learning with Multiple Cameras." Reinforcement Learning Journal, vol. 2, 2024, pp. 498–515.

BibTeX:

@article{dunion2024multi,
    title={Multi-view Disentanglement for Reinforcement Learning with Multiple Cameras},
    author={Dunion, Mhairi and Albrecht, Stefano V},
    journal={Reinforcement Learning Journal},
    volume={2},
    pages={498--515},
    year={2024}
}
