- RAGEN / StarPO: https://arxiv.org/abs/2504.20073
- MUA-RL: https://arxiv.org/abs/2508.18669
- UGST / Goal Alignment: https://arxiv.org/abs/2507.20152
- SimulatorArena: https://arxiv.org/abs/2510.05444
- Persona Simulator RL: https://arxiv.org/abs/2511.00222
- SWEET-RL / ColBench: https://arxiv.org/abs/2503.15478
- Turn-Level Rewards: https://arxiv.org/abs/2505.11821
- τ2-bench: https://arxiv.org/abs/2506.07982
- AgentGym-RL: https://arxiv.org/abs/2509.08755
- AgentRL: https://arxiv.org/abs/2510.04206
- Agent-R1: https://arxiv.org/abs/2511.14460
- RAGEN-2: https://arxiv.org/abs/2604.06268
- ProRL Agent: https://arxiv.org/abs/2603.18815
- LOOP / Long-Horizon Interactive LLM Agents: https://arxiv.org/abs/2502.01600
- RAGEN GitHub: https://github.com/RAGEN-AI/RAGEN
- AgentRL GitHub: https://github.com/THUDM/AgentRL
- Agent-R1 GitHub: https://github.com/AgentR1/Agent-R1
- AgentGym-RL project: https://AgentGym-RL.github.io