A Batch Sequential Halving Algorithm without Performance Degradation

By Sotetsu Koyamada, Soichiro Nishimori, and Shin Ishii

Reinforcement Learning Journal, vol. 5, 2024, pp. 2218–2232.

Presented at the Reinforcement Learning Conference (RLC), Amherst Massachusetts, August 9–12, 2024.


Download:

Abstract:

In this paper, we investigate the problem of pure exploration in the context of multi-armed bandits, with a specific focus on scenarios where arms are pulled in fixed-size batches. Batching has been shown to enhance computational efficiency, but it can potentially lead to a degradation compared to the original sequential algorithm's performance due to delayed feedback and reduced adaptability. We introduce a simple batch version of the Sequential Halving (SH) algorithm (Karnin et al., 2013) and provide theoretical evidence that batching does not degrade the performance of the original algorithm under practical conditions. Furthermore, we empirically validate our claim through experiments, demonstrating the robust nature of the SH algorithm in fixed-size batch settings.


Citation Information:

Sotetsu Koyamada, Soichiro Nishimori, and Shin Ishii. "A Batch Sequential Halving Algorithm without Performance Degradation." Reinforcement Learning Journal, vol. 5, 2024, pp. 2218–2232.

BibTeX:

@article{koyamada2024batch,
    title={A Batch Sequential Halving Algorithm without Performance Degradation},
    author={Koyamada, Sotetsu and Nishimori, Soichiro and Ishii, Shin},
    journal={Reinforcement Learning Journal},
    volume={5},
    pages={2218--2232},
    year={2024}
}