StefanS-prog
**Describe the bug** The gradient accumulation implementation with `mx.eval` causes memory usage to spike suddenly to more than 10 GB. Before the `mx.eval` call, usage is around 600 MB, which is reasonable. **To...