Cutting inference cold starts by 40x with LP, FUSE, C/R, and CUDA-checkpoint

65 points | by charles_irl 6 hours ago

15 comments