Matrix-vector multiplication implemented in off-the-shelf DRAM for Low-Bit LLMs

230 points | by cpldcpu 4 months ago

55 comments