Article How I contributed a new model to the Transformers library using Codex 6 days ago • 40
MMFace-DiT: A Dual-Stream Diffusion Transformer for High-Fidelity Multimodal Face Generation Paper • 2603.29029 • Published 6 days ago • 13
A Survey of On-Policy Distillation for Large Language Models Paper • 2604.00626 • Published 4 days ago • 7
Benchmarking and Mechanistic Analysis of Vision-Language Models for Cross-Depiction Assembly Instruction Alignment Paper • 2604.00913 • Published 4 days ago • 4
ViGoR-Bench: How Far Are Visual Generative Models From Zero-Shot Visual Reasoners? Paper • 2603.25823 • Published 10 days ago • 42
RealChart2Code: Advancing Chart-to-Code Generation with Real Data and Multi-Task Evaluation Paper • 2603.25804 • Published 10 days ago • 28
On Token's Dilemma: Dynamic MoE with Drift-Aware Token Assignment for Continual Learning of Large Vision Language Models Paper • 2603.27481 • Published 7 days ago • 35
MOOZY: A Patient-First Foundation Model for Computational Pathology Paper • 2603.27048 • Published 8 days ago • 6
ResAdapt: Adaptive Resolution for Efficient Multimodal Reasoning Paper • 2603.28610 • Published 6 days ago • 20
Diffutron: A Masked Diffusion Language Model for Turkish Language Paper • 2603.20466 • Published 16 days ago • 8
MedOpenClaw: Auditable Medical Imaging Agents Reasoning over Uncurated Full Studies Paper • 2603.24649 • Published 11 days ago • 31
Representation Alignment for Just Image Transformers is not Easier than You Think Paper • 2603.14366 • Published 21 days ago • 13
Less Gaussians, Texture More: 4K Feed-Forward Textured Splatting Paper • 2603.25745 • Published 10 days ago • 14
VFIG: Vectorizing Complex Figures in SVG with Vision-Language Models Paper • 2603.24575 • Published 11 days ago • 18
AVControl: Efficient Framework for Training Audio-Visual Controls Paper • 2603.24793 • Published 11 days ago • 26
MACRO: Advancing Multi-Reference Image Generation with Structured Long-Context Data Paper • 2603.25319 • Published 10 days ago • 32
MSA: Memory Sparse Attention for Efficient End-to-End Memory Model Scaling to 100M Tokens Paper • 2603.23516 • Published about 1 month ago • 45