The official datasets and model checkpoints of AEPO
- Agentic Entropy-Balanced Policy Optimization
  Paper • 2510.14545 • Published • 104
- dongguanting/Qwen3-8B-AEPO-DeepSearch
  Text Generation • 8B • Updated • 10 • 1
- dongguanting/Qwen3-14B-AEPO-DeepSearch
  Text Generation • 15B • Updated • 8 • 1
- dongguanting/Qwen2.5-7B-AEPO
  Text Generation • 8B • Updated • 22 • 1