Retrospective Sparse Attention for Efficient Long-Context Generation — arXiv:2508.09001 (published Aug 12, 2025)
Kimi Linear: An Expressive, Efficient Attention Architecture — arXiv:2510.26692 (published Oct 30, 2025)
Exploring Conditions for Diffusion models in Robotic Control — arXiv:2510.15510 (published Oct 17, 2025)
LightMem: Lightweight and Efficient Memory-Augmented Generation — arXiv:2510.18866 (published Oct 21, 2025)
Every Attention Matters: An Efficient Hybrid Architecture for Long-Context Reasoning — arXiv:2510.19338 (published Oct 22, 2025)
LiteStage: Latency-aware Layer Skipping for Multi-stage Reasoning — arXiv:2510.14211 (published Oct 16, 2025)
QWHA: Quantization-Aware Walsh-Hadamard Adaptation for Parameter-Efficient Fine-Tuning on Large Language Models — arXiv:2509.17428 (published Sep 22, 2025)
Don't Overthink It: A Survey of Efficient R1-style Large Reasoning Models — arXiv:2508.02120 (published Aug 4, 2025)
Mixture of Scales: Memory-Efficient Token-Adaptive Binarization for Large Language Models — arXiv:2406.12311 (published Jun 18, 2024)
Reasoning Path Compression: Compressing Generation Trajectories for Efficient LLM Reasoning — arXiv:2505.13866 (published May 20, 2025)
Decoupled Planning and Execution: A Hierarchical Reasoning Framework for Deep Search — arXiv:2507.02652 (published Jul 3, 2025)
LongWriter-Zero: Mastering Ultra-Long Text Generation via Reinforcement Learning — arXiv:2506.18841 (published Jun 23, 2025)
SeerAttention-R: Sparse Attention Adaptation for Long Reasoning — arXiv:2506.08889 (published Jun 10, 2025)
The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity — arXiv:2506.06941 (published Jun 7, 2025)
Think Only When You Need with Large Hybrid-Reasoning Models — arXiv:2505.14631 (published May 20, 2025)
LServe: Efficient Long-sequence LLM Serving with Unified Sparse Attention — arXiv:2502.14866 (published Feb 20, 2025)
Revisiting the Test-Time Scaling of o1-like Models: Do they Truly Possess Test-Time Scaling Capabilities? — arXiv:2502.12215 (published Feb 17, 2025)