Which Experiences Are Influential for RL Agents? Efficiently Estimating The Influence of Experiences

By Takuya Hiraoka, Takashi Onishi, Guanquan Wang, and Yoshimasa Tsuruoka

Reinforcement Learning Journal, vol. TBD, 2025, pp. TBD.

Presented at the Reinforcement Learning Conference (RLC), Edmonton, Alberta, Canada, August 5–9, 2025.


Abstract:

In reinforcement learning (RL) with experience replay, experiences stored in a replay buffer influence the RL agent's performance. Information about how these experiences influence the agent's performance is valuable for various purposes, such as identifying experiences that negatively influence underperforming agents. One method for estimating the influence of experiences is the leave-one-out (LOO) method. However, this method is usually computationally prohibitive. In this paper, we present Policy Iteration with Turn-over Dropout (PIToD), which efficiently estimates the influence of experiences. We evaluate how accurately PIToD estimates the influence of experiences and how efficient it is compared to LOO. We then apply PIToD to amend underperforming RL agents, i.e., we use PIToD to estimate negatively influential experiences for the RL agents and to delete the influence of these experiences. We show that RL agents' performance is significantly improved via amendments with PIToD. Our code is available at: https://github.com/TakuyaHiraoka/Which-Experiences-Are-Influential-for-RL-Agents
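
The sketch below is not the authors' implementation; it only illustrates, under simplifying assumptions, the turn-over dropout idea named in the abstract: each experience trains only a fixed sub-network selected by its own mask, so influence can be estimated by comparing that sub-network against the flipped one (which never learned from the experience) instead of retraining with the experience left out. The toy regression setup, the MaskedMLP class, and the make_mask/influence helpers are all hypothetical names introduced for illustration.

import torch
import torch.nn as nn

torch.manual_seed(0)

class MaskedMLP(nn.Module):
    """Tiny value network whose hidden layer can be gated by a per-experience mask."""
    def __init__(self, in_dim=4, hidden=32, out_dim=1):
        super().__init__()
        self.fc1 = nn.Linear(in_dim, hidden)
        self.fc2 = nn.Linear(hidden, out_dim)

    def forward(self, x, mask):
        # `mask` selects the active hidden units; scaling by 2 keeps the expected
        # activation comparable to the unmasked network (dropout rate 0.5).
        h = torch.relu(self.fc1(x)) * mask * 2.0
        return self.fc2(h)

def make_mask(exp_id, hidden=32, p=0.5):
    # Deterministic mask derived from the experience index, so the same
    # experience always updates the same sub-network.
    g = torch.Generator().manual_seed(exp_id)
    return (torch.rand(hidden, generator=g) < p).float()

# Toy replay buffer: (state, regression target) pairs standing in for real experiences.
buffer = [(torch.randn(4), torch.randn(1)) for _ in range(16)]
net = MaskedMLP()
opt = torch.optim.Adam(net.parameters(), lr=1e-2)

# Training: when experience i is replayed, only its sub-network (mask m_i) is updated,
# so the flipped sub-network (1 - m_i) never learns from experience i.
for epoch in range(200):
    for i, (s, y) in enumerate(buffer):
        m = make_mask(i)
        loss = (net(s, m) - y).pow(2).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()

def influence(i, eval_batch):
    # Compare an evaluation error with experience i's sub-network (trained on i)
    # against the flipped sub-network (never trained on i). A positive value
    # suggests the experience helped on this batch; a negative value suggests it hurt.
    m = make_mask(i)
    with torch.no_grad():
        err_with = sum((net(s, m) - y).pow(2).mean() for s, y in eval_batch)
        err_without = sum((net(s, 1.0 - m) - y).pow(2).mean() for s, y in eval_batch)
    return (err_without - err_with).item()

print(influence(0, buffer))

In the paper's setting the quantity being compared would be a policy evaluation or improvement metric rather than this toy regression error, and "deleting" a harmful experience amounts to using the flipped mask so the agent relies only on parameters that never learned from it; refer to the linked repository for the actual method.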


Citation Information:

Takuya Hiraoka, Takashi Onishi, Guanquan Wang, and Yoshimasa Tsuruoka. "Which Experiences Are Influential for RL Agents? Efficiently Estimating The Influence of Experiences." Reinforcement Learning Journal, vol. TBD, 2025, pp. TBD.

BibTeX:
@article{hiraoka2025which,
    title={Which Experiences Are Influential for {RL} Agents? {E}fficiently Estimating The Influence of Experiences},
    author={Hiraoka, Takuya and Onishi, Takashi and Wang, Guanquan and Tsuruoka, Yoshimasa},
    journal={Reinforcement Learning Journal},
    year={2025}
}