Fast KV Compaction via Attention Matching

by Adam Zweiger et al.

Feb 20, 2026 (duration 07:49)

KV Cache Compaction, Attention Matching, Latent Space Compaction, Pareto Frontier of Speed-Quality