KVzip: Query-Agnostic KV Cache Compression with Context Reconstruction Paper • 2505.23416 • Published May 29, 2025
Dataset Condensation via Efficient Synthetic-Data Parameterization Paper • 2205.14959 • Published May 30, 2022
GuidedQuant: Large Language Model Quantization via Exploiting End Loss Guidance Paper • 2505.07004 • Published May 11, 2025
LayerMerge: Neural Network Depth Compression through Layer Pruning and Merging Paper • 2406.12837 • Published Jun 18, 2024
Efficient Latency-Aware CNN Depth Compression via Two-Stage Dynamic Programming Paper • 2301.12187 • Published Jan 28, 2023