Matrix-vector multiplication implemented in off-the-shelf DRAM for Low-Bit LLMs

230 points | by cpldcpu a year ago

55 comments