nimlgen
To test how slow it is and what we can add to CI.
new:

```
STEPS=700 BS=512 GPUS=1 TARGET_EVAL_ACC_PCT=93.5 DEBUG=0 python3 examples/hlb_cifar10.py
memory reduced from 201.00 MB to 198.60 MB
shuffling training dataset in 1101.67 ms (epoch=0)
memory reduced from 1974.03 MB to...
```
This approach allocates one big buffer and uses `offsets` into it for the other buffers, so it is essentially an offline dynamic storage allocation problem. I will play with this more, but I think...
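A minimal sketch of what "offline dynamic storage allocation" could look like here, assuming each buffer has a known size and a known live range (first and last step it is used). The `Buf` class, the first-fit heuristic, and all names below are illustrative assumptions, not the actual implementation: the idea is just that buffers whose lifetimes overlap must receive non-overlapping offset ranges inside one big arena, while buffers with disjoint lifetimes may reuse the same offset.

```python
# Hypothetical sketch: place buffers with known lifetimes into one arena.
# Greedy first-fit by decreasing size is a common heuristic for this
# NP-hard problem; it is NOT claimed to be the approach used in the PR.
from dataclasses import dataclass

@dataclass
class Buf:
    size: int
    start: int  # first step this buffer is live
    end: int    # last step this buffer is live

def assign_offsets(bufs):
    placed = []   # list of (offset, Buf) already positioned in the arena
    offsets = {}  # buffer index -> chosen offset
    # place large buffers first; break ties by earlier start
    order = sorted(range(len(bufs)), key=lambda i: (-bufs[i].size, bufs[i].start))
    for i in order:
        b = bufs[i]
        # occupied ranges from buffers whose lifetimes overlap with b's
        conflicts = sorted((off, off + p.size) for off, p in placed
                           if not (p.end < b.start or b.end < p.start))
        off = 0
        for lo, hi in conflicts:
            if off + b.size <= lo:
                break           # fits in the gap before this occupied range
            off = max(off, hi)  # otherwise slide past it
        placed.append((off, b))
        offsets[i] = off
    arena_size = max((off + b.size for off, b in placed), default=0)
    return offsets, arena_size

# Two buffers with disjoint lifetimes share offset 0; a third buffer
# live the whole time is placed after them, so the arena is 150 bytes
# instead of the naive 250.
offs, total = assign_offsets([Buf(100, 0, 1), Buf(100, 2, 3), Buf(50, 0, 3)])
print(offs, total)  # → {0: 0, 1: 0, 2: 100} 150
```

The payoff is exactly the kind of "memory reduced from X to Y" numbers in the log above: total arena size shrinks whenever short-lived buffers can reuse each other's space.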
DSP progress