Understanding and Harnessing Sparsity in Unified Multimodal Models Paper • 2512.02351 • Published 7 days ago • 1
Understanding and Harnessing Sparsity in Unified Multimodal Models Paper • 2512.02351 • Published 7 days ago • 1 • 2
MMaDA-Parallel: Multimodal Large Diffusion Language Models for Thinking-Aware Editing and Generation Paper • 2511.09611 • Published 26 days ago • 68
When Visualizing is the First Step to Reasoning: MIRA, a Benchmark for Visual Chain-of-Thought Paper • 2511.02779 • Published Nov 4 • 57
SRPO: Enhancing Multimodal LLM Reasoning via Reflection-Aware Reinforcement Learning Paper • 2506.01713 • Published Jun 2 • 48
Router-Tuning: A Simple and Effective Approach for Enabling Dynamic-Depth in Transformers Paper • 2410.13184 • Published Oct 17, 2024 • 3
Capacity-Aware Inference: Mitigating the Straggler Effect in Mixture of Experts Paper • 2503.05066 • Published Mar 7 • 4
Capacity-Aware Inference: Mitigating the Straggler Effect in Mixture of Experts Paper • 2503.05066 • Published Mar 7 • 4 • 2
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning Paper • 2501.12948 • Published Jan 22 • 429
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model Paper • 2405.04434 • Published May 7, 2024 • 24