Oguz Ulgen
The memory regression is not related to reify and/or jit/interp. I simplified your code above and ran it with and without the reify keyword, as well as with and without jit....
@jlebar I don't actually have a preference which one we choose, as long as we are consistent. Would you be willing to put up a PR to set the flag? And...
While playing with toy examples, I have seen 3 combinations. ``` @triton.jit def kernel_with_label( in_ptr0, in_ptr1, out_ptr, n_elements, BLOCK_SIZE: "tl.constexpr", ): pid = tl.program_id(axis=0) if pid > 1: return block_start...
There's also an inconsistency whose source I'm not certain of: printing the exact same kernel with the same triton rev occasionally results in different formats. Although,...
Ah, you're right. Calling `module.verify()` on the second example results in ``` error: 'cf.cond_br' op operand #0 must be 1-bit signless integer, but got 'tensor' ``` I assume in these...
Could you explain why this is a limitation? I assume something in `CodeGenerator` needs to properly handle scopes. It uses `ast.NodeVisitor` which can handle this case. There's probably a simple...
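To illustrate the point about `ast.NodeVisitor`, here is a minimal sketch using only the standard library. It is not Triton's `CodeGenerator`; `ReturnScopeVisitor` and the toy `kernel` source are hypothetical names made up for this example. It only shows that a visitor can track how deeply nested each `return` is, which is the kind of scope information an early `return` inside an `if` would need.

```python
import ast


class ReturnScopeVisitor(ast.NodeVisitor):
    """Hypothetical helper: records the nesting depth of every `return`."""

    def __init__(self):
        self.depth = 0
        self.returns = []  # (line number, nesting depth) pairs

    def _visit_nested(self, node):
        # Entering an if/for/while body opens one more level of nesting.
        self.depth += 1
        self.generic_visit(node)
        self.depth -= 1

    visit_If = visit_For = visit_While = _visit_nested

    def visit_Return(self, node):
        self.returns.append((node.lineno, self.depth))


# Toy source with an early return inside an `if` (assumed for illustration).
SOURCE = """
def kernel(pid):
    if pid > 1:
        return
    x = pid + 1
    return
"""

visitor = ReturnScopeVisitor()
visitor.visit(ast.parse(SOURCE))

# Line numbers of returns that sit inside a nested block.
nested_returns = [line for line, depth in visitor.returns if depth > 0]
```

Whether the real code generator can use this information as easily depends on how it lowers control flow, which this sketch does not model.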
Yep, I can observe the difference. I have further minified the repro so that dynamo and inductor are no longer generating code; this is based on the inductor-generated code....
Nice find @aakhundov! Unrelated, but it looks like it also marks None as constexpr ``` if arg.param.is_constexpr or arg.param.num in configs[0].equal_to_1 or arg.value is None ``` We should handle...
The CPU time difference is also concerning; perhaps inductor's precompile has an inefficiency we need to fix here (fyi @Chillee @eellison)
Should we compare the `functions` from the Lark version and this new version to make sure they match, so that we don't accidentally change structures? Not just end to end but...
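A small sketch of why that condition flags `None` arguments. The `Param`, `Arg`, and `Config` classes below are hypothetical stand-ins with just the attributes the quoted expression touches; they are not the real Triton types, and only the boolean condition itself is taken from the snippet above.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Param:  # hypothetical stand-in, not the real Triton class
    num: int
    is_constexpr: bool = False


@dataclass
class Arg:  # hypothetical stand-in
    param: Param
    value: Any


@dataclass
class Config:  # hypothetical stand-in holding only `equal_to_1`
    equal_to_1: set = field(default_factory=set)


def treated_as_constexpr(arg, configs):
    # Same boolean condition as the quoted snippet.
    return (
        arg.param.is_constexpr
        or arg.param.num in configs[0].equal_to_1
        or arg.value is None
    )


configs = [Config(equal_to_1={1})]
args = [
    Arg(Param(0), 3.0),                      # ordinary runtime argument
    Arg(Param(1), 1),                        # specialized because it equals 1
    Arg(Param(2), None),                     # None, not declared constexpr
    Arg(Param(3, is_constexpr=True), 64),    # declared constexpr
]
flags = [treated_as_constexpr(a, configs) for a in args]
```

In this toy setup the third argument is classified as constexpr purely because its value is `None`, even though its parameter is not declared that way.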
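One way such a structural check could look, as a rough sketch: it assumes both parsers produce a plain mapping from function name to a comparable description. The `diff_functions` helper and the sample data are hypothetical and do not reflect the actual shape of `functions` in either version.

```python
def diff_functions(old, new):
    """Return {name: (old_value, new_value)} for every entry that differs.

    A side that lacks the name is reported as None.
    """
    missing = object()  # sentinel so a stored None is not confused with "absent"
    diffs = {}
    for name in sorted(set(old) | set(new)):
        a = old.get(name, missing)
        b = new.get(name, missing)
        if a != b:
            diffs[name] = (
                None if a is missing else a,
                None if b is missing else b,
            )
    return diffs


# Made-up sample outputs of the two parsers, for illustration only.
lark_functions = {
    "add": {"args": ["x", "y"], "ret": "int"},
    "neg": {"args": ["x"], "ret": "int"},
}
new_functions = {
    "add": {"args": ["x", "y"], "ret": "int"},
    "neg": {"args": ["x"], "ret": "float"},
    "mul": {"args": ["x", "y"], "ret": "int"},
}

mismatches = diff_functions(lark_functions, new_functions)
```

An empty result would mean the two versions agree on every entry, which a test could assert alongside the end-to-end checks.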