Simple, zero overhead way to compress model, KV cache via Low-Rank Decomposition
1 point | by thw20 | an hour ago
No comments yet.