MiniMax teased M3 Sparse Attention: 9.7x prefilling, 15.6x decoding at 1M

8 points | by rebekkamikkoa 21 hours ago