ELT: Elastic Looped Transformers for Visual Generation • Paper • 2604.09168 • Published 5 days ago • 17
RefineAnything: Multimodal Region-Specific Refinement for Perfect Local Details • Paper • 2604.06870 • Published 7 days ago • 37
Matrix-Game 3.0: Real-Time and Streaming Interactive World Model with Long-Horizon Memory • Paper • 2604.08995 • Published 5 days ago • 41
WildDet3D: Scaling Promptable 3D Detection in the Wild • Paper • 2604.08626 • Published 6 days ago • 224
SkillClaw: Let Skills Evolve Collectively with Agentic Evolver • Paper • 2604.08377 • Published 6 days ago • 273
Vanast: Virtual Try-On with Human Image Animation via Synthetic Triplet Supervision • Paper • 2604.04934 • Published 9 days ago • 42
TriAttention: Efficient Long Reasoning with Trigonometric KV Compression • Paper • 2604.04921 • Published 9 days ago • 106
GaussianGPT: Towards Autoregressive 3D Gaussian Scene Generation • Paper • 2603.26661 • Published 18 days ago • 25
Sommelier: Scalable Open Multi-turn Audio Pre-processing for Full-duplex Speech Language Models • Paper • 2603.25750 • Published 26 days ago • 36
4DGS360: 360° Gaussian Reconstruction of Dynamic Objects from a Single Video • Paper • 2603.21618 • Published 23 days ago • 15
Toward Physically Consistent Driving Video World Models under Challenging Trajectories • Paper • 2603.24506 • Published 20 days ago • 6
Nemotron-Cascade 2: Post-Training LLMs with Cascade RL and Multi-Domain On-Policy Distillation • Paper • 2603.19220 • Published 26 days ago • 66
MetaClaw: Just Talk -- An Agent That Meta-Learns and Evolves in the Wild • Paper • 2603.17187 • Published 28 days ago • 138
WorldCam: Interactive Autoregressive 3D Gaming Worlds with Camera Pose as a Unifying Geometric Representation • Paper • 2603.16871 • Published 28 days ago • 60
DVD: Deterministic Video Depth Estimation with Generative Priors • Paper • 2603.12250 • Published Mar 12 • 26
Holi-Spatial: Evolving Video Streams into Holistic 3D Spatial Intelligence • Paper • 2603.07660 • Published Mar 8 • 86
LoGeR: Long-Context Geometric Reconstruction with Hybrid Memory • Paper • 2603.03269 • Published Mar 3 • 63
Planning in 8 Tokens: A Compact Discrete Tokenizer for Latent World Model • Paper • 2603.05438 • Published Mar 5 • 40