- Less is More: Recursive Reasoning with Tiny Networks
  Paper • 2510.04871 • Published • 491
- Cache-to-Cache: Direct Semantic Communication Between Large Language Models
  Paper • 2510.03215 • Published • 97
- When Thoughts Meet Facts: Reusable Reasoning for Long-Context LMs
  Paper • 2510.07499 • Published • 48
- StreamingVLM: Real-Time Understanding for Infinite Video Streams
  Paper • 2510.09608 • Published • 50
Jiwon Song
jiwonsong
AI & ML interests
AI Compression & Acceleration
Recent Activity
- upvoted a paper about 1 month ago: Retrospective Sparse Attention for Efficient Long-Context Generation
- updated a collection ("read") about 1 month ago
- upvoted a paper about 1 month ago: Kimi Linear: An Expressive, Efficient Attention Architecture