PixelSmile: Toward Fine-Grained Facial Expression Editing Paper • 2603.25728 • Published 3 days ago • 110
6Bit-Diffusion: Inference-Time Mixed-Precision Quantization for Video Diffusion Models Paper • 2603.18742 • Published 10 days ago • 4
UniGRPO: Unified Policy Optimization for Reasoning-Driven Visual Generation Paper • 2603.23500 • Published 5 days ago • 34
TrajLoom: Dense Future Trajectory Generation from Video Paper • 2603.22606 • Published 6 days ago • 5
Scaling DoRA: High-Rank Adaptation via Factored Norms and Fused Kernels Paper • 2603.22276 • Published 6 days ago • 13
Manifold-Aware Exploration for Reinforcement Learning in Video Generation Paper • 2603.21872 • Published 6 days ago • 33
Speed by Simplicity: A Single-Stream Architecture for Fast Audio-Video Generative Foundation Model Paper • 2603.21986 • Published 6 days ago • 116
Versatile Editing of Video Content, Actions, and Dynamics without Training Paper • 2603.17989 • Published 11 days ago • 16
Do VLMs Need Vision Transformers? Evaluating State Space Models as Vision Encoders Paper • 2603.19209 • Published 10 days ago • 5
3DreamBooth: High-Fidelity 3D Subject-Driven Video Generation Model Paper • 2603.18524 • Published 10 days ago • 58
MetaClaw: Just Talk -- An Agent That Meta-Learns and Evolves in the Wild Paper • 2603.17187 • Published 12 days ago • 134
MiroThinker-1.7 & H1: Towards Heavy-Duty Research Agents via Verification Paper • 2603.15726 • Published 13 days ago • 181
ViFeEdit: A Video-Free Tuner of Your Video Diffusion Transformer Paper • 2603.15478 • Published 13 days ago • 24
OmniForcing: Unleashing Real-time Joint Audio-Visual Generation Paper • 2603.11647 • Published 17 days ago • 31
HiFi-Inpaint: Towards High-Fidelity Reference-Based Inpainting for Generating Detail-Preserving Human-Product Images Paper • 2603.02210 • Published 27 days ago • 29