A Gradient Perspective on RLVR Stability and Winner Advantage Policy Optimization Paper • 2606.16154 • Published 19 days ago • 8
Agentic Monte Carlo: Simulating Reinforcement Learning for Black-Box Agents Paper • 2606.05296 • Published about 1 month ago • 10
Inconsistencies In Consistency Models: Better ODE Solving Does Not Imply Better Samples Paper • 2411.08954 • Published Nov 13, 2024 • 11
Response Quality Assessment for Retrieval-Augmented Generation via Conditional Conformal Factuality Paper • 2506.20978 • Published Jun 26, 2025 • 1
RankJudge: A Multi-Turn LLM-as-a-Judge Synthetic Benchmark Generator Paper • 2605.21748 • Published May 20 • 17
Chartographer: Counterfactual Chart Generation for Evaluating Vision-Language Models Paper • 2605.27311 • Published May 26 • 3
Beyond Procedure: Substantive Fairness in Conformal Prediction Paper • 2602.16794 • Published Feb 18 • 1
On the Burden of Achieving Fairness in Conformal Prediction Paper • 2605.14260 • Published May 15 • 1