drazi
@brandon-yujie-sun to keep track
Good suggestion! It's a very useful feature that we are considering adding (ETA is TBD).
@brandon-yujie-sun
@thakkarV is right. To simplify the first release, we don't support 64-bit or mixed integer types in the CuTe algebra. This limitation will be removed in a future release. Thanks for catching....
> [@ccecka](https://github.com/ccecka) where are the dependencies coming from? (e.g. congruent, coprofile, etc.)
>
> and yeah, in my testing I found that matplotlib ends up being a bottleneck at larger...
> **What is your question?** `cute.copy` will always fully unroll its inner load/store. But in some cases, the unrolling in `cute.copy` will cause serious register spilling. So I wonder how...
@brandon-yujie-sun it might be useful to add unrolling control to the API for copy & gemm?
> When I compile cutdsl from source and run `import cutlass`, I get the error "No module named 'cutlass._mlir'". I'd like to know what operations need to be performed on...
> There is only one ld.shared.v4, it seems to do a memory optimization pass to reuse the register.

Do you mind elaborating on this a bit more?
I suspect that for global memory, the compiler can't assume that no other thread writes to the same location, so it shouldn't optimize out the global memory load. For shared memory, it's...