Weigao Sun

weigao266

https://weigao266.github.io/

AI & ML interests

Algo & MLSys

upvoted a paper 3 months ago

Native Hybrid Attention for Efficient Sequence Modeling

Paper • 2510.07019 • Published Oct 8, 2025 • 16

upvoted 2 papers 4 months ago

Reasoning over Boundaries: Enhancing Specification Alignment via Test-time Deliberation

Paper • 2509.14760 • Published Sep 18, 2025 • 53

NVIDIA Nemotron Nano 2: An Accurate and Efficient Hybrid Mamba-Transformer Reasoning Model

Paper • 2508.14444 • Published Aug 20, 2025 • 39

upvoted 4 papers 5 months ago

Intern-S1: A Scientific Multimodal Foundation Model

Paper • 2508.15763 • Published Aug 21, 2025 • 259

CogniBench: A Legal-inspired Framework and Dataset for Assessing Cognitive Faithfulness of Large Language Models

Paper • 2505.20767 • Published May 27, 2025 • 1

Speed Always Wins: A Survey on Efficient Architectures for Large Language Models

Paper • 2508.09834 • Published Aug 13, 2025 • 53

Mixture-of-Recursions: Learning Dynamic Recursive Depths for Adaptive Token-Level Computation

Paper • 2507.10524 • Published Jul 14, 2025 • 70

upvoted 2 papers 6 months ago

Thinking with Images for Multimodal Reasoning: Foundations, Methods, and Future Frontiers

Paper • 2506.23918 • Published Jun 30, 2025 • 89

IntFold: A Controllable Foundation Model for General and Specialized Biomolecular Structure Prediction

Paper • 2507.02025 • Published Jul 2, 2025 • 35

upvoted 5 papers 7 months ago

LongLLaDA: Unlocking Long Context Capabilities in Diffusion LLMs

Paper • 2506.14429 • Published Jun 17, 2025 • 44

Unfolding Spatial Cognition: Evaluating Multimodal Models on Visual Simulations

Paper • 2506.04633 • Published Jun 5, 2025 • 19

Advancing Multimodal Reasoning: From Optimized Cold Start to Staged Reinforcement Learning

Paper • 2506.04207 • Published Jun 4, 2025 • 48

SmolVLA: A Vision-Language-Action Model for Affordable and Efficient Robotics

Paper • 2506.01844 • Published Jun 2, 2025 • 147

The Entropy Mechanism of Reinforcement Learning for Reasoning Language Models

Paper • 2505.22617 • Published May 28, 2025 • 131

upvoted 2 papers 8 months ago

Learn to Reason Efficiently with Adaptive Length-based Reward Shaping

Paper • 2505.15612 • Published May 21, 2025 • 34

Accelerate TarFlow Sampling with GS-Jacobi Iteration

Paper • 2505.12849 • Published May 19, 2025 • 7

upvoted a collection 8 months ago

Liger

6 items • Updated Mar 20, 2025 • 3

upvoted a paper 9 months ago

A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, and Beyond

Paper • 2503.21614 • Published Mar 27, 2025 • 42

upvoted a collection 10 months ago

MoM

9 items • Updated Mar 18, 2025 • 2

upvoted a paper 10 months ago

Linear-MoE: Linear Sequence Modeling Meets Mixture-of-Experts

Paper • 2503.05447 • Published Mar 7, 2025 • 8