MegaTrain: Full Precision Training of 100B+ Parameter LLMs on a Single GPU

222 points | by chrsw 8 hours ago

42 comments