ocannl
Alternatives to dynamic indexing? Generically optimize one-hot encoding pattern for embeddings in low_level.ml
I'm building an example that requires dynamic indexing, but I noticed that OCANNL doesn't support it. I could work around it with one-hots, but I'm wondering whether there are other, more efficient alternatives I might be missing.
I removed dynamic indexing because it was complicating shape inference, and tinygrad did not support it either (at the time; I'm guessing it still doesn't). In the meantime, shape inference got so complicated that reintroducing dynamic indexing would be a drop in the bucket. However, dynamic indexing would make optimization hard (especially later on, for OCANNL versions 0.7 and 0.8). How complex is the indexing you're after? Maybe we could add something like a virtual one-hot vector construct, or an extension to the einsum-style specifications.
Not that complex, I think -- I'm just implementing Karpathy's makemore, and the dynamic indexing is the `emb = C[X]` part.
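For reference, the one-hot workaround amounts to replacing the lookup `C[X]` by a contraction against a one-hot matrix: each embedded row is `sum_v onehot(X[b])[v] * C[v, :]`. Here is a minimal sketch of that computation in plain OCaml (plain arrays rather than OCANNL tensors; all names are mine, just to pin down the pattern a compiler would have to recognize):

```ocaml
(* A minimal sketch of the one-hot workaround for emb = C[X], using plain
   OCaml arrays rather than OCANNL tensors. *)

(* [c] has shape [vocab_size][emb_dim]; [xs] is a batch of integer indices. *)
let embed_via_one_hot ~vocab_size ~emb_dim (c : float array array) (xs : int array) =
  Array.map
    (fun x ->
      (* Build the one-hot row for index [x]. *)
      let one_hot = Array.init vocab_size (fun v -> if v = x then 1.0 else 0.0) in
      (* emb_row.(j) = sum_v one_hot.(v) *. c.(v).(j) -- a reduction that a
         compiler could recognize and collapse into a direct lookup c.(x).(j). *)
      Array.init emb_dim (fun j ->
        let acc = ref 0.0 in
        for v = 0 to vocab_size - 1 do
          acc := !acc +. (one_hot.(v) *. c.(v).(j))
        done;
        !acc))
    xs
```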
There is still a glaring bug around; I'm triaging and sprinting to fix the worst things for a 0.6.0 release very soon. It looks to me like tinygrad is using one-hot encodings: https://github.com/tinygrad/tinygrad/blob/master/tinygrad/nn/init.py#L325
It looks pretty trivial to detect and optimize the one-hot embedding pattern into a dynamic index at the low-level internal representation simplification stage. This requires extending `Low_level.t` but no changes outside `low_level.ml`.
> It looks pretty trivial
Alright, save this issue for me :)
How would you detect it though? Unless we literally loop over the tensor data to decide whether it's a one-hot, wouldn't you have to annotate the one-hot at a higher compilation stage?
I responded too quickly; it is in fact not trivial, because it also requires adding dynamic indexing to `indexing.ml`.
> How would you detect it though?
Detect a for-loop containing a sum-accumulating assignment with a reduction (as in matrix multiplication), where one side of the product is an equality between an `Embed_index` and a `Get`.
(Edit: apologies, there is no `Where` operation involved here.)
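To make the detection rule concrete, here is a rough sketch against a hypothetical, much-simplified IR; the real `Low_level.t` is richer, and all constructor names below are invented for illustration only:

```ocaml
(* Hypothetical, simplified IR sketch of the detection idea; the constructor
   names are made up and do not match the actual Low_level.t. *)
type idx =
  | Sym of string                           (* a loop / iteration symbol *)
  | Dyn of string * idx list                (* dynamic index: value read from a tensor *)

type expr =
  | Get of string * idx list                (* read an element of a tensor *)
  | Eq of idx * expr                        (* one-hot test: (index = value) as 0. / 1. *)
  | Mul of expr * expr

type stmt =
  | For of string * int * stmt              (* loop symbol, extent, body *)
  | Accum_sum of string * idx list * expr   (* lhs[indices] += rhs *)
  | Assign of string * idx list * expr      (* lhs[indices]  = rhs *)

(* Rewrite  sum_v (v = X[...]) * C[v, j]   ==>   C[X[...], j] :
   the one-hot reduction becomes a single dynamic lookup. *)
let rec simplify (s : stmt) : stmt =
  match s with
  | For (v, _, Accum_sum (lhs, lhs_idx,
        Mul (Eq (Sym v', Get (x, x_idx)), Get (c, c_idx))))
    when v = v' ->
      (* Substitute the reduced symbol [v] by a dynamic index into [x]. *)
      let subst i = if i = Sym v then Dyn (x, x_idx) else i in
      Assign (lhs, lhs_idx, Get (c, List.map subst c_idx))
  | For (v, n, body) -> For (v, n, simplify body)
  | s -> s
```

The point of the rewrite is that the loop over the vocabulary disappears and the reduction collapses into a single dynamic read, which is why it also needs dynamic indexing support in `indexing.ml` as mentioned above.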
I think an embedding operation can already be implemented using the range operation with the vocabulary size as a non-tensor input. But with the shape inference machinery, I'd like to have an embedding operation defined in `operation.ml` that does not take the vocabulary size as input, but adapts to it based on tensor shapes. This is unfortunately not yet possible, but it should be; I just need to add the right missing piece. (There are a couple of ways of getting there, and we want one with a good generality-simplicity profile.)
Or maybe it's already possible with a combination of `range_over_offsets` and an einsum spec.