MiroThinker: Pushing the Performance Boundaries of Open-Source Research Agents via Model, Context, and Interactive Scaling Paper • 2511.11793 • Published Nov 14, 2025 • 165
Scaling Agent Learning via Experience Synthesis Paper • 2511.03773 • Published Nov 5, 2025 • 81
ViDoRe V3: a comprehensive evaluation of retrieval for enterprise use-cases Article • Published Nov 5, 2025 • 57
CoDA: Agentic Systems for Collaborative Data Visualization Paper • 2510.03194 • Published Oct 3, 2025 • 28
BrowseComp-Plus: A More Fair and Transparent Evaluation Benchmark of Deep-Research Agent Paper • 2508.06600 • Published Aug 8, 2025 • 41
Introducing RTEB: A New Standard for Retrieval Evaluation Article • Published Oct 1, 2025 • 132
Smol2Operator: Post-Training GUI Agents for Computer Use Article • Published Sep 23, 2025 • 134
MM-BrowseComp: A Comprehensive Benchmark for Multimodal Browsing Agents Paper • 2508.13186 • Published Aug 14, 2025 • 19