hustvl/vgt_internvl3_1_6B_pretrain
Text-to-Image
DiffusionVL: Translating Any Autoregressive Models into Diffusion Vision Language Models
InfiniteVL: Synergizing Linear and Sparse Attention for Highly-Efficient, Unlimited-Input Vision-Language Models