Running GPT-OSS-120B at 500 tokens per second on Nvidia GPUs

246 points | by philipkiely 7 days ago

178 comments