
Make efficientnet trainer fast

Open • geohot opened this issue 2 years ago • 0 comments

  • [x] Support asymmetric padding
  • [ ] Support caching of only unresolved LazyBuffers (see CACHE_LAZYBUFFERS)
  • [ ] Support not running the convs multiple times, aka late split. This fixes an openpilot runner bug too. ProcessingOps counts without and with LAZY=1:
      # <enum 'ProcessingOps'> 70
      GRAPH=1 CNT=1 GPU=1 PYTHONPATH="." python3 examples/benchmark_train_efficientnet.py
      # <enum 'ProcessingOps'> 208
      GRAPH=1 CNT=1 LAZY=1 PYTHONPATH="." python3 examples/benchmark_train_efficientnet.py
  • [x] Support counting FLOPS and work out theoretical max
  • [x] "Upstream" LAZY and make it default
  • [ ] Remove padding from the ml conv op and remove the slice on execution?
  • [ ] Collapse reduce into the binary op
  • [ ] Improve the "JIT" so it does not constant-fold values like the learning rate (LR)
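
For context on the asymmetric padding item: EfficientNet-style "same" padding becomes asymmetric whenever the total padding is odd, for example a 3x3 stride-2 conv on an even-sized input. The sketch below follows the TensorFlow SAME convention (the extra pixel goes on the bottom/right). The function names are hypothetical and this is not tinygrad's implementation, just an illustration of where the asymmetry comes from.

```python
import math

def same_padding_1d(in_size, kernel, stride, dilation=1):
    """Return (pad_before, pad_after) for one spatial axis, TF "SAME" style."""
    eff_kernel = (kernel - 1) * dilation + 1      # kernel extent after dilation
    out_size = math.ceil(in_size / stride)        # SAME keeps ceil(in / stride) outputs
    total = max((out_size - 1) * stride + eff_kernel - in_size, 0)
    before = total // 2                           # smaller half goes first
    after = total - before                        # extra pixel (if any) goes last
    return before, after

def same_padding_2d(hw, kernel_hw, stride_hw):
    """Return (left, right, top, bottom) padding for an (H, W) input."""
    top, bottom = same_padding_1d(hw[0], kernel_hw[0], stride_hw[0])
    left, right = same_padding_1d(hw[1], kernel_hw[1], stride_hw[1])
    return (left, right, top, bottom)

if __name__ == "__main__":
    # 224x224 input, 3x3 kernel, stride 2: one pixel on the bottom/right only
    print(same_padding_2d((224, 224), (3, 3), (2, 2)))  # (0, 1, 0, 1)
```

A conv op that only accepts a single symmetric padding value per axis cannot express the `(0, 1)` case directly, which is why separate before/after values are needed.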
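
For the FLOPS counting item, a minimal sketch of how a conv's floating-point work and the fraction of theoretical peak could be estimated. It assumes the usual convention of 2 FLOPs per multiply-accumulate; `conv2d_flops`, `utilization`, and the peak figure passed in are hypothetical names and inputs, not tinygrad's counters.

```python
def conv2d_flops(batch, cin, cout, out_h, out_w, k_h, k_w, groups=1):
    """Estimate FLOPs for one conv2d forward pass (2 FLOPs per multiply-accumulate)."""
    # Each output element sums over (cin / groups) * k_h * k_w products.
    macs = batch * cout * out_h * out_w * (cin // groups) * k_h * k_w
    return 2 * macs

def utilization(flops, seconds, peak_flops_per_s):
    """Fraction of the device's theoretical peak actually achieved."""
    return flops / seconds / peak_flops_per_s

if __name__ == "__main__":
    # A 3x3 stride-2 stem conv: 3 -> 32 channels, 112x112 output
    stem = conv2d_flops(1, 3, 32, 112, 112, 3, 3)
    # A depthwise 3x3 conv on 32 channels (groups == channels)
    depthwise = conv2d_flops(1, 32, 32, 112, 112, 3, 3, groups=32)
    print(stem, depthwise)
```

Summing this over every op in a training step and dividing by the measured step time gives achieved FLOPS, which can then be compared against the device's advertised peak.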

geohot, Jun 25 '22 06:06