MolmoMotion: Forecasting Point Trajectories in 3D with Language Instruction Paper • 2606.18558 • Published 10 days ago • 53
Qwen-AgentWorld: Language World Models for General Agents Paper • 2606.24597 • Published 4 days ago • 130
The Verification Horizon: No Silver Bullet for Coding Agent Rewards Paper • 2606.26300 • Published 3 days ago • 36
Wan-Streamer v0.1: End-to-end Real-time Interactive Foundation Models Paper • 2606.25041 • Published 4 days ago • 84
ImageWAM: Do World Action Models Really Need Video Generation, or Just Image Editing? Paper • 2606.19531 • Published 10 days ago • 20
Holistic Data Scheduler for LLM Pre-training via Multi-Objective Reinforcement Learning Paper • 2606.24133 • Published 4 days ago • 8
Autodata: An agentic data scientist to create high quality synthetic data Paper • 2606.25996 • Published 3 days ago • 9
UnityShots: Memory-Driven Multi-Shot Audio-Video Generation with Boundary-Aware Gating Paper • 2606.21661 • Published 8 days ago • 24
WorldMemArena: Evaluating Multimodal Agent Memory Through Action-World Interaction Paper • 2605.29341 • Published about 1 month ago • 18
Imaginative Perception Tokens Enhance Spatial Reasoning in Multimodal Language Models Paper • 2606.03988 • Published 24 days ago • 126
WorldAct: Activating Monolithic 3D Worlds into Interactive-Ready Object-Centric Scenes Paper • 2605.15843 • Published May 15 • 6
TideGS: Scalable Training of Over One Billion 3D Gaussian Splatting Primitives via Out-of-Core Optimization Paper • 2605.20150 • Published May 19 • 7
LongLive-2.0: An NVFP4 Parallel Infrastructure for Long Video Generation Paper • 2605.18739 • Published May 18 • 116