OpenVoxel: Training-Free Grouping and Captioning Voxels for Open-Vocabulary 3D Scene Understanding Paper • 2601.09575 • Published 2 days ago • 22
Fast-ThinkAct: Efficient Vision-Language-Action Reasoning via Verbalizable Latent Planning Paper • 2601.09708 • Published 1 day ago • 41
4D-RGPT: Toward Region-level 4D Understanding via Perceptual Distillation Paper • 2512.17012 • Published 29 days ago • 43