cydoroga
Thanks a lot! Am I right that the only reason I get a wrong answer is that threadblocks are using memory that should have been locked? If I introduce a...
Hi! I eventually implemented a version of `GemmArray` that can handle overlapping outputs. It required adding a `semaphore`, a workspace for the `semaphore`, and three new parameters: - `overlap_multiplicity`...
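To illustrate why overlapping outputs need a lock, here is a minimal CPU-side sketch in Python. It is not the `GemmArray` change itself: the function name `accumulate_overlapping`, the tile format, and the use of a single `threading.Lock` in place of the `semaphore` are all assumptions made for illustration. Each worker stands in for a threadblock that adds its partial result into an output region that other workers may also write to.

```python
import threading


def accumulate_overlapping(tiles, output_size):
    """Add every tile into a shared output buffer.

    tiles: list of (offset, values) pairs; regions may overlap.
    This function and its arguments are hypothetical, for illustration only.
    """
    output = [0.0] * output_size
    lock = threading.Lock()  # assumption: stands in for the semaphore

    def worker(offset, values):
        # Compute the partial result outside the critical section.
        partial = [float(v) for v in values]
        # Only one worker at a time may update the shared output.
        with lock:
            for i, p in enumerate(partial):
                output[offset + i] += p

    threads = [threading.Thread(target=worker, args=t) for t in tiles]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return output


if __name__ == "__main__":
    # Three tiles whose output regions overlap at indices 1..3.
    tiles = [(0, [1, 2, 3]), (2, [10, 20]), (1, [100, 100, 100])]
    print(accumulate_overlapping(tiles, 4))
```

Without the lock, two workers could read the same old value and one update would be lost, which matches the kind of wrong answer described above. On a GPU the read-modify-write on the output tile would be guarded in a similar way, but the details depend on the kernel.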
I think this issue can be closed: I found that the `Arguments` signature for ColumnMajor matrices differs from the signature for RowMajor matrices. Here is the change...
Hi! I'm trying to run `triton.ops.blocksparse.matmul` but am struggling with this error: `matmul = triton.ops.blocksparse.matmul(layout.cpu(), block_size, 'dds', trans_a=False, trans_b=True)` `out = matmul(x, w)` Traceback (most recent call last): .... File "/slot/sandbox/j/.local/lib/python3.6/site-packages/triton/code_gen.py",...
Hi @btyu! Thanks for the answer. As I said, I've tried v2.0, the latest available version. It does not work. I totally forgot to mention: I was able to run...