chenyu
#3271 and beam searching resnet for example
- [x] assert if index > int32 #4157
- [ ] fix linearizer and check index max, use int64 if needed. assert if...
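A minimal sketch of the index-width check described above, assuming a hypothetical helper that picks the index dtype from the buffer's element count (names are illustrative, not tinygrad's actual linearizer API):

```python
import math

INT32_MAX = 2**31 - 1

def index_dtype_for(shape):
    """Pick an index dtype wide enough to address every element.

    Returns "int64" when the largest flat index exceeds int32 range,
    otherwise "int32". Asserts (raises) if even int64 would overflow.
    """
    numel = math.prod(shape)
    if numel - 1 > 2**63 - 1:
        raise OverflowError(f"index overflows int64 for shape {shape}")
    return "int64" if numel - 1 > INT32_MAX else "int32"

print(index_dtype_for((256, 256)))      # small buffer, int32 suffices
print(index_dtype_for((65536, 65536)))  # 2**32 elements, needs int64
```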
previously needed due to kernel buffer count limit, seems fine to remove now
This matches rsqrt backward at 0 to torch. Also added exact tests for reciprocal, sqrt, and rsqrt. See https://pytorch.org/docs/stable/notes/autograd.html#gradients-for-non-differentiable-functions for how torch defines derivatives at these points.
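For reference, a pure-Python finite-difference check of the rsqrt derivative away from the singularity (a sketch only; the value torch assigns at the non-differentiable point x=0 is defined by the linked note, not by this code):

```python
import math

def rsqrt(x):
    return 1.0 / math.sqrt(x)

# analytic derivative: d/dx x**-0.5 = -0.5 * x**-1.5
def rsqrt_grad(x):
    return -0.5 * x**-1.5

# central finite difference at x=4, well away from x=0
x, h = 4.0, 1e-6
numeric = (rsqrt(x + h) - rsqrt(x - h)) / (2 * h)
print(numeric, rsqrt_grad(x))  # both ≈ -0.0625 at x=4

# as x -> 0+, rsqrt_grad(x) -> -inf, which is why the backward
# value at exactly 0 needs an explicit convention
```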
example usage from discord

```
for t in range(0, inp_seq_len):
    model(X[:, t].to(device))
y_pred = torch.zeros(Y.size()).to(device)
for i in range(0, out_seq_len):
    y_pred[:, i] = model()
loss = criterion(y_pred.to(device), Y.to(device))
loss.backward()
clip_grads(model)
...
```
targeting early June
- [ ] device test plans
- [ ] onboarding
- [ ] LLM 8B 200+ tok/s
- [ ] 70B llama works
- [ ] mixtral...
targeting end of May
- [ ] docs
- [ ] setitem #4574
- [ ] `pip install tinygrad` well tested
- [ ] all tests pass locally on tinyboxes...
example with rewriting `sparse_categorical_crossentropy` using getitem

```
from tinygrad import Tensor, GlobalCounters

X = Tensor.rand(256, 1000).realize()
Y = Tensor.randint(256, low=0, high=10).realize()
GlobalCounters.reset()
X.sparse_categorical_crossentropy(Y, label_smoothing=0.1).realize()
print(f"{GlobalCounters.global_ops=}, {GlobalCounters.global_mem=}, {GlobalCounters.kernel_count=}")

def scc2(self, Y:Tensor,...
```
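The getitem rewrite replaces the one-hot multiply with direct indexing of the true-class log-probability. A pure-Python sketch of why the two formulations are equivalent (toy numbers, no label smoothing; function names are illustrative, not tinygrad's):

```python
def sparse_ce_onehot(logprobs, labels):
    """Cross entropy via a one-hot dot product (the naive formulation)."""
    total = 0.0
    for row, y in zip(logprobs, labels):
        onehot = [1.0 if j == y else 0.0 for j in range(len(row))]
        total += -sum(o * lp for o, lp in zip(onehot, row))
    return total / len(labels)

def sparse_ce_getitem(logprobs, labels):
    """Same loss, but index the true-class log-prob directly (getitem)."""
    return sum(-row[y] for row, y in zip(logprobs, labels)) / len(labels)

# toy batch: 2 samples, 3 classes, rows are already log-softmaxed
lp = [[-0.1, -2.3, -3.0], [-1.5, -0.4, -2.0]]
ys = [0, 1]
print(sparse_ce_onehot(lp, ys), sparse_ce_getitem(lp, ys))  # both print 0.25
```

The getitem form skips materializing the one-hot tensor entirely, which is where the op and memory savings come from.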
should be able to debug nan loss and inf output
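A hypothetical pure-Python helper in the spirit of that debugging goal, scanning outputs for the offending values (illustrative only; not a tinygrad API):

```python
import math

def find_bad_values(name, values):
    """Scan a flat list of floats and report any nan or inf entries."""
    bad = [(i, v) for i, v in enumerate(values)
           if math.isnan(v) or math.isinf(v)]
    for i, v in bad:
        print(f"{name}[{i}] = {v}")
    return bad

# a toy "output" with one inf and one nan to flag
find_bad_values("output", [0.5, float("inf"), float("nan"), 1.0])
```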
```
FAILED test/test_ops.py::TestOps::test_fancy_conv2d - AssertionError: invalid shrink ((0, 2), (0, 3), (0, 9), (0, 26)) for (2, 3, 8, 28)
FAILED test/test_ops.py::TestOps::test_nested_conv2d - Exception: backward pass tensor 0 failed shape...
```
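The "invalid shrink" assertion fires when a shrink range exceeds an axis bound. A minimal sketch of the bounds check (illustrative, not tinygrad's actual implementation):

```python
def valid_shrink(arg, shape):
    """A shrink ((b, e), ...) is valid iff 0 <= b <= e <= dim per axis."""
    return len(arg) == len(shape) and all(
        0 <= b <= e <= dim for (b, e), dim in zip(arg, shape))

# the failing case from the log: axis 2 asks for (0, 9) on a dim of size 8
print(valid_shrink(((0, 2), (0, 3), (0, 9), (0, 26)), (2, 3, 8, 28)))  # False
print(valid_shrink(((0, 2), (0, 3), (0, 8), (0, 26)), (2, 3, 8, 28)))  # True
```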
repro

```
from tinygrad import Tensor, TinyJit, Variable

@TinyJit
def f(t): return t + 1

for i in range(1, 5):
    vi = Variable("i", 1, 10).bind(i)
    t = f(Tensor.rand(i).reshape(vi))
    print(f"{t.shape=}")
```
...