Writing Speed-of-Light Flash Attention for 5090 in CUDA C++

158 points | by dsr12 3 days ago

34 comments