DRIVE: Data Curation Best Practices for Reinforcement Learning with Verifiable Reward in Competitive Code Generation Paper • 2511.06307 • Published Nov 9, 2025 • 51
LaSeR: Reinforcement Learning with Last-Token Self-Rewarding Paper • 2510.14943 • Published Oct 16, 2025 • 39
AutoCodeBench: Large Language Models are Automatic Code Benchmark Generators Paper • 2508.09101 • Published Aug 12, 2025 • 8