Big speedup
With these changes I could speed up execution considerably. On my machine the run now takes about 14 seconds instead of over 2 minutes (including saving the plot as SVG; 12 seconds without it).
The changes are:
Because the expression tree is always exactly the same (when not batching), it can be reused: instead of re-creating the whole tree every time, the new `refresh()` method recalculates all the values in-place. This method also sets `grad` to `0.0` in every node, so `zero_grad()` isn't needed in the optimization loop anymore.
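Here is a minimal sketch of what such a `refresh()` could look like on micrograd's `Value` class. The `_forward` closure is an assumption: stock micrograd does not keep a re-runnable forward function per node, so this presumes each op stored one when the node was created; the actual implementation may differ.

```python
# Assumed addition to each op, e.g. in Value.__add__, next to _backward:
#     def _forward():
#         out.data = self.data + other.data
#     out._forward = _forward
# Leaf Values would have self._forward = None set in __init__.

def refresh(self):
    # build a topological order, leaves first (same traversal as backward())
    topo, visited = [], set()
    def build(v):
        if v not in visited:
            visited.add(v)
            for child in v._prev:
                build(child)
            topo.append(v)
    build(self)
    # recompute every node's data in place and reset its gradient,
    # which is why a separate zero_grad() pass is no longer needed
    for v in topo:
        if v._forward is not None:  # leaves keep their (possibly updated) data
            v._forward()
        v.grad = 0.0
```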
Note that for this improvement to work with batching, you need a little more preparation. You need to create `Xb` and `yb`, dummy lists of batch size containing `Value` objects, and use these to set up the calculations before the optimization loop. In the `loss()` function you then need to make the random index selection and update the `value.data` attributes of all the objects in `Xb` and `yb` before running `refresh()`, as sketched below. ~~I didn't include an example for that at the moment.~~ I also added code to the demo that demonstrates batching.
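A hypothetical sketch of that batching setup, assuming the demo's `X`, `y` arrays (two features, labels ±1 as in the demo's max-margin loss), an already-constructed `model`, and the modified engine with the `refresh()` method sketched above; `Xb`, `yb`, and `batch_size` follow the comment, everything else is illustrative.

```python
import random
from micrograd.engine import Value

batch_size = 32
# dummy Value objects; the graph is built on these once, up front
Xb = [[Value(0.0) for _ in range(2)] for _ in range(batch_size)]
yb = [Value(0.0) for _ in range(batch_size)]

# build the computation graph once, before the optimization loop
scores = [model(xrow) for xrow in Xb]
losses = [(1 + -yi * si).relu() for yi, si in zip(yb, scores)]
total_loss = sum(losses) * (1.0 / batch_size)

def loss():
    # pick a fresh random batch and write it into the existing Value objects
    idx = random.sample(range(len(X)), batch_size)
    for row, i in zip(Xb, idx):
        for v, x in zip(row, X[i]):
            v.data = x
    for v, i in zip(yb, idx):
        v.data = y[i]
    total_loss.refresh()  # recompute all values in place; grads reset to 0.0
    return total_loss
```

In the training loop you would then call `loss()`, `total_loss.backward()`, and update the parameters; no `zero_grad()` call is needed because `refresh()` already reset every `grad`.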
Good idea, but it also makes this project less intuitive with non-essential optimizations.