From Trainee to Trainer: LLM-Designed Training Environment for RL with Multi-Agent Reasoning | Chao Chen et al. | ResearchPod