Google's TurboQuant AI-compression algorithm can reduce LLM memory usage by 6x

16 points | by gmays a day ago

2 comments