Matrix-vector multiplication implemented in off-the-shelf DRAM for Low-Bit LLMs

175 points | by cpldcpu 16 hours ago

43 comments