DR Tulu: Reinforcement Learning with Evolving Rubrics for Deep Research • Paper • 2511.19399 • Published Nov 24, 2025 • 60
nvidia/Llama-Nemotron-Post-Training-Dataset • Updated May 8, 2025 • 3.91M • 5.25k • 635
rl-rag/rar_cb_bs_16_rollout_8__1__1759453746_checkpoints_step_100 • 333k • Updated Oct 11, 2025 • 5