Jason Ansel

199 comments by Jason Ansel

Thanks! The yindex/xindex naming is an artifact of `torch.compile`'s Triton codegen and how we map to GPU grids there. I'll swap it for Halide output, though that shouldn't matter in this...

Makes sense. I am hitting this error somewhat frequently, so a fix would be very helpful! If there is a way to get 64-bit indexing that might also fix it...

I'm kind of surprised we didn't find this earlier... won't it just result in a recompile loop until we hit the cache limit?
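To make the concern concrete, here is a toy model of a guarded compile cache, not Dynamo's actual implementation. `ToyCompileCache`, `never_passes`, and the limit of 8 are all hypothetical names and values chosen for illustration. If a guard can never pass, every call misses the cache and triggers another "compile" until the limit is reached, after which calls fall back to running uncompiled.

```python
class ToyCompileCache:
    """Hypothetical stand-in for a guarded compile cache with a size limit."""

    def __init__(self, limit):
        self.limit = limit
        self.entries = []        # list of (guard, compiled_fn) pairs
        self.compile_count = 0   # how many times we "compiled"
        self.fell_back = False   # set once the limit forces a fallback

    def call(self, fn, arg, make_guard):
        # Reuse an existing entry if any guard accepts this input.
        for guard, compiled in self.entries:
            if guard(arg):
                return compiled(arg)
        # Cache is full: stop compiling and run the original function.
        if len(self.entries) >= self.limit:
            self.fell_back = True
            return fn(arg)
        # Otherwise "compile" (here: just store fn) with a fresh guard.
        self.compile_count += 1
        self.entries.append((make_guard(arg), fn))
        return fn(arg)


def never_passes(arg):
    # A broken guard: it rejects every later input, including identical ones.
    return lambda other: False


cache = ToyCompileCache(limit=8)
results = [cache.call(lambda x: x + 1, i, never_passes) for i in range(20)]
```

In this model the results stay correct throughout; the cost is the wasted compiles before the fallback kicks in.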

Since guards are always attached to a specific (fixed) code object, `co_varnames`/`co_localsplusnames` is just a constant. Therefore, we could legally generate a guard check like `fast_locals[2] == Py_None`, with the...
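A minimal Python sketch of the idea, assuming a guard that only needs to test one local for `None`. `make_none_guard` and `example` are hypothetical names, and a plain tuple stands in for the frame's fast-locals array. The point is that the name-to-slot lookup happens once, when the guard is built, because the code object's `co_varnames` never changes.

```python
def make_none_guard(code, name):
    """Hypothetical helper: build a guard that tests a local by slot index."""
    # co_varnames is fixed for a given code object, so this lookup
    # can be done once at guard-generation time.
    index = code.co_varnames.index(name)

    def guard(fast_locals):
        # Index-based check; no name lookup on the hot path.
        return fast_locals[index] is None

    return guard, index


def example(a, b, c):
    return a, b, c


guard, index = make_none_guard(example.__code__, "c")
passes = guard((1, 2, None))   # slot for `c` holds None
fails = guard((1, 2, 3))       # slot for `c` holds a non-None value
```

Here `c` is the third argument of `example`, so the guard ends up checking slot 2, matching the `fast_locals[2]` form in the comment.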

You can specify `--database=` or `args.database` to tell it what filename to use. Looks like it didn't have write access to the default one.
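As a rough illustration of how a `--database=` flag ends up as `args.database`, here is a stand-in parser built with the standard `argparse` module. It is not the tool's own parser: the default filename `results.db`, the example path, and the `is_writable_location` helper are assumptions made for this sketch.

```python
import argparse
import os


def parse_args(argv):
    # Stand-in parser; the real tool defines its own options.
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--database",
        default="results.db",  # hypothetical default filename
        help="file used to store results",
    )
    return parser.parse_args(argv)


def is_writable_location(path):
    # Check whether the directory that would hold the file is writable.
    directory = os.path.dirname(os.path.abspath(path))
    return os.access(directory, os.W_OK)


args = parse_args(["--database=/tmp/tuning.db"])
default_args = parse_args([])
```

Pointing the flag at a directory you own avoids the write-access failure on the default location.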

This is very cool and I love the approach. Good work! I think the biggest challenge will be coming up with the right heuristics of when to apply this. There...

I think starting with a simple heuristic makes sense, and perhaps some config to force-enable it. Hopefully we can find a robust heuristic. If you want to try out heuristics,...

No, but I agree we need that.

I think @bertmaher mentioned someone on his team would add the cooperative launch and `tl.atomic_load(ptr, sem="relaxed")` to Triton. It may make sense to delay turning this on by default until...