MiroThinker: Pushing the Performance Boundaries of Open-Source Research Agents via Model, Context, and Interactive Scaling Paper • 2511.11793 • Published Nov 14, 2025 • 165
FlashAdventure: A Benchmark for GUI Agents Solving Full Story Arcs in Diverse Adventure Games Paper • 2509.01052 • Published Sep 1, 2025 • 21