arxiv:2509.16197
Haotian Zhang
haotiz
AI & ML interests
Vision and Language
Recent Activity
liked a dataset 1 day ago
nvidia/PhysicalAI-Autonomous-Vehicle-Cosmos-Drive-Dreams
upvoted a paper 2 months ago
Ferret-UI Lite: Lessons from Building Small On-Device GUI Agents
authored a paper 3 months ago
MANZANO: A Simple and Scalable Unified Multimodal Model with a Hybrid Vision Tokenizer