O(1) Memory Neural Network Training with Reversible Architectures

3 points | by amazedsaint 21 hours ago

1 comment

amazedsaint 21 hours ago
I'm proposing a new architecture for O(1) memory neural network training. ZeroActivation is built on reversible layers and enables training of arbitrarily deep neural networks with memory usage that stays constant as depth grows. It should be a good fit for Apple Silicon (M1/M2) users and for anyone who wants to train models that would otherwise be too deep to fit in memory. The kicker is a new model I developed for Reverse SSA. More on that soon. Thanks!
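For readers unfamiliar with reversible layers, here is a minimal sketch of the general idea, assuming RevNet-style additive coupling. This is not the ZeroActivation code; the functions f and g, the helper names, and the depth of 8 are all hypothetical choices for illustration. The point is that each block's inputs can be recomputed from its outputs, so a backward pass would not need to keep per-layer activations around.

```python
import math

def f(x):
    # Hypothetical sub-function F; any deterministic function works here.
    return [0.5 * math.tanh(v) for v in x]

def g(x):
    # Hypothetical sub-function G; it does not need to be invertible itself.
    return [0.5 * math.sin(v) for v in x]

def forward(x1, x2):
    # Additive coupling: y1 = x1 + F(x2), y2 = x2 + G(y1).
    y1 = [a + b for a, b in zip(x1, f(x2))]
    y2 = [a + b for a, b in zip(x2, g(y1))]
    return y1, y2

def inverse(y1, y2):
    # Undo the coupling in reverse order: x2 = y2 - G(y1), x1 = y1 - F(x2).
    x2 = [a - b for a, b in zip(y2, g(y1))]
    x1 = [a - b for a, b in zip(y1, f(x2))]
    return x1, x2

def run_stack(x1, x2, depth):
    # Only the current pair of activations is kept, never a per-layer list.
    for _ in range(depth):
        x1, x2 = forward(x1, x2)
    return x1, x2

def unwind_stack(y1, y2, depth):
    # Walk back through the stack, recomputing each block's inputs.
    for _ in range(depth):
        y1, y2 = inverse(y1, y2)
    return y1, y2

if __name__ == "__main__":
    a, b = [0.1, -0.4], [0.7, 0.2]
    out = run_stack(a, b, depth=8)
    rec = unwind_stack(out[0], out[1], depth=8)
    print("output:", out)
    print("reconstructed input:", rec)
```

The reconstruction is exact only up to floating-point rounding, and the sketch leaves out gradients entirely. In a real training loop each block would recompute its inputs like this and then backpropagate through f and g locally.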