Data-Efficient RLVR via Off-Policy Influence Guidance Paper • 2510.26491 • Published Oct 30, 2025
MAPS: Advancing Multi-Modal Reasoning in Expert-Level Physical Science Paper • 2501.10768 • Published Jan 18, 2025