WorldGen: From Text to Traversable and Interactive 3D Worlds Paper • 2511.16825 • Published Nov 20, 2025 • 23
STARFlow-V: End-to-End Video Generative Modeling with Normalizing Flow Paper • 2511.20462 • Published Nov 25, 2025 • 31
Chain-of-Visual-Thought: Teaching VLMs to See and Think Better with Continuous Visual Tokens Paper • 2511.19418 • Published Nov 24, 2025 • 28
Latent Diffusion Model without Variational Autoencoder Paper • 2510.15301 • Published Oct 17, 2025 • 49
LLaVA-OneVision-1.5: Fully Open Framework for Democratized Multimodal Training Paper • 2509.23661 • Published Sep 28, 2025 • 47
A Vision-Language-Action-Critic Model for Robotic Real-World Reinforcement Learning Paper • 2509.15937 • Published Sep 19, 2025 • 20
The Well: a Large-Scale Collection of Diverse Physics Simulations for Machine Learning Paper • 2412.00568 • Published Nov 30, 2024 • 23
A Survey of Reinforcement Learning for Large Reasoning Models Paper • 2509.08827 • Published Sep 10, 2025 • 190
SPATIALGEN: Layout-guided 3D Indoor Scene Generation Paper • 2509.14981 • Published Sep 18, 2025 • 27
Symbolic Graphics Programming with Large Language Models Paper • 2509.05208 • Published Sep 5, 2025 • 46
ThinkAct: Vision-Language-Action Reasoning via Reinforced Visual Latent Planning Paper • 2507.16815 • Published Jul 22, 2025 • 39