Rayan Hatout


Alright, so with OpenCL I get the following results: saves roughly `80,000` calls to `isinstance`, from `132,929` down to `55,789` **(-58%)**. Forward pass on average **9%** faster; backward pass on average...
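For context, a call count like that can be pulled straight out of `cProfile`, which attributes calls to the `isinstance` builtin. A minimal sketch, assuming tinygrad's `Tensor` API; the toy matmul below is a stand-in, not the actual benchmark:

```python
import cProfile, pstats
from tinygrad.tensor import Tensor

def workload():
    # toy stand-in for the real forward pass
    x = Tensor.randn(64, 64)
    (x @ x).relu().sum().numpy()  # .numpy() forces the lazy graph to evaluate

prof = cProfile.Profile()
prof.runcall(workload)

# pstats keys are (filename, lineno, funcname); builtins show up with a '~' filename
for (fname, lineno, funcname), (cc, ncalls, tt, ct, callers) in pstats.Stats(prof).stats.items():
    if funcname == "<built-in method builtins.isinstance>":
        print(f"isinstance calls: {ncalls}")
```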

Well, this was fun; figuring out this single test case took me longer than the rest of the PR combined. Anyway, it passes locally now and performance numbers are...

OK, spent some time hunting down exactly where things go wrong in the OpenCL test and came up with a minimal example: `def test_sum_num_hoisted_and_factors_cancel_out`
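The test name suggests a sum where the constant gets hoisted while the variable factors cancel out. A minimal sketch of what that case might look like, assuming tinygrad's `tinygrad.shape.symbolic` API; the bounds and expected string are illustrative, not the actual test body:

```python
from tinygrad.shape.symbolic import Variable

def test_sum_num_hoisted_and_factors_cancel_out():
    a = Variable("a", 0, 8)
    # the constant 1 is hoisted out of the sum, and a*4 + a*-4 cancels,
    # so the whole expression should fold down to the constant 1
    expr = (a * 4 + 1) + (a * -4)
    assert expr.render() == "1"
```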

Not ready for review yet; the changes make `mypy` very angry, plus I'm getting a lot of variance on the performance improvement figures. @geohot do we have a more disciplined...
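Whatever the truncated question was asking for exactly, one disciplined setup is to warm up, repeat the measurement many times, and report the median and spread instead of a single run. A sketch; the `bench` helper and its parameters are made up for illustration:

```python
import statistics, time

def bench(fn, warmup=5, reps=50):
    """Time fn() reps times after warmup runs; return (median_ms, stdev_ms)."""
    for _ in range(warmup):
        fn()
    times_ms = []
    for _ in range(reps):
        t0 = time.perf_counter()
        fn()
        times_ms.append((time.perf_counter() - t0) * 1e3)
    return statistics.median(times_ms), statistics.stdev(times_ms)
```

Running this against both branches (e.g. `bench(lambda: model(x).numpy())`, with a hypothetical `model` and `x`) and comparing medians should cut through the run-to-run variance.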

@YurySolovyov yeah, we should probably add a rule for that if I manage to consistently show a perf improvement; right now it's a bit unclear. @geohot my bad, I hadn't...

Hey @geohot, I got sidetracked and did a bunch of ad-hoc optimizations that shaved 300 ms off the runtime for LLaMA on my M1 Mac (800 ms -> 500 ms). Would you mind running...

Hmmm, experimenting with some stuff: is it worth making `forward` slower if we can make `backward` significantly faster? Before:
```
forward pass: 16.117 ms, 0.84x off...
```
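For reference, numbers like the ones above can be produced with a harness along these lines; a sketch assuming tinygrad's lazy `Tensor` API, with a toy matmul standing in for the real model:

```python
import time
from tinygrad.tensor import Tensor

x = Tensor.randn(256, 256, requires_grad=True)
w = Tensor.randn(256, 256, requires_grad=True)

t0 = time.perf_counter()
loss = (x @ w).relu().sum()
loss.numpy()  # tinygrad is lazy, so this forces the forward pass to run
t1 = time.perf_counter()

loss.backward()
x.grad.numpy()  # likewise forces the backward graph to actually execute
t2 = time.perf_counter()

print(f"forward pass: {(t1 - t0) * 1e3:.3f} ms")
print(f"backward pass: {(t2 - t1) * 1e3:.3f} ms")
```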

In the general case, though, is that tradeoff not worth it? Or is the vision that tinygrad is more about fast inference than fast training?

Huh, actually I completely hallucinated the tradeoff; I hadn't re-measured "before" in a while. Seems like the baseline is now _way_ faster than what it was in the "after" of my...