F-GRPO: Don’t Let Your Policy Learn the Obvious and Forget the Rare | Daniil Plyusov et al. | ResearchPod