Zone of Proximal Policy Optimization: Teacher in Prompts, Not Gradients Paper • 2606.18216 • Published 5 days ago • 54
Agent Explorative Policy Optimization for Multimodal Agentic Reasoning Paper • 2605.28774 • Published 25 days ago • 93
view article Article Towards Speed-of-Light Text Generation with Nemotron-Labs Diffusion Language Models nvidia • 29 days ago • 34
view article Article Towards Speed-of-Light Text Generation with Nemotron-Labs Diffusion Language Models nvidia • 29 days ago • 34
Nemotron-Labs-Diffusion Collection A Tri-Mode Language Model Family Unifying Autoregressive, Diffusion, and Self-Speculation Decoding • 7 items • Updated 9 days ago • 49