Context-Aware RL for Agentic and Multimodal LLMs | ResearchPod