Jack Kosaian

15 comments by Jack Kosaian

@jeromeku, It sounds like you've already figured out where new `Arguments` should be placed: [here](https://github.com/NVIDIA/cutlass/blob/47a3ebbea9860e14c095b52c4e6e2db33340f572/include/cutlass/gemm/kernel/gemm_grouped.h#L130). You'll also need to add them to the kernel's `Params` struct [here](https://github.com/NVIDIA/cutlass/blob/47a3ebbea9860e14c095b52c4e6e2db33340f572/include/cutlass/gemm/kernel/gemm_grouped.h#L217), similar to how...
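As a rough illustration of the `Arguments`-to-`Params` pattern mentioned above, here is a minimal sketch. `ToyGroupedKernel` and the `gather_indices` field are hypothetical names chosen for this example; they are not the actual definitions in `gemm_grouped.h`, which carry many more members. The point is only that a field added to `Arguments` also needs a matching member in `Params` and a copy in the `Params` constructor.

```cpp
#include <cassert>

// Hypothetical, heavily simplified stand-in for a kernel class.
// These are NOT the real CUTLASS definitions.
struct ToyGroupedKernel {
  // Host-side arguments supplied by the caller.
  struct Arguments {
    int problem_count = 0;
    float const* const* ptr_A = nullptr;
    int const* gather_indices = nullptr;  // hypothetical new field
  };

  // Parameters handed to the kernel, derived from Arguments.
  struct Params {
    int problem_count = 0;
    float const* const* ptr_A = nullptr;
    int const* gather_indices = nullptr;  // mirrored new field

    Params() = default;

    // Copy every field the kernel needs, including the new one.
    explicit Params(Arguments const& args)
        : problem_count(args.problem_count),
          ptr_A(args.ptr_A),
          gather_indices(args.gather_indices) {
      assert(args.problem_count >= 0);
    }
  };
};
```

If the new field is added to `Arguments` but not copied in the `Params` constructor, the kernel would see only the default value.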

Any sort of padding would need to be handled externally to `can_implement`. You would need to pad your tensors, problem shapes, etc. before setting them in the `Arguments` struct.
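A minimal sketch of what such external padding could look like, assuming row-major host buffers and a single alignment requirement for all dimensions. `round_up`, `ProblemShape`, `pad_problem`, and `pad_row_major` are hypothetical helpers written for this example, not CUTLASS APIs; the alignment your kernel actually needs depends on its configuration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: round x up to the next multiple of alignment.
inline int round_up(int x, int alignment) {
  assert(alignment > 0);
  return ((x + alignment - 1) / alignment) * alignment;
}

// Hypothetical stand-in for a GEMM problem shape (not a CUTLASS type).
struct ProblemShape {
  int m, n, k;
};

// Pad every dimension of the problem shape to the given alignment.
inline ProblemShape pad_problem(ProblemShape p, int alignment) {
  return {round_up(p.m, alignment), round_up(p.n, alignment),
          round_up(p.k, alignment)};
}

// Copy a row-major rows x cols matrix into a zero-filled
// padded_rows x padded_cols buffer.
template <typename T>
std::vector<T> pad_row_major(std::vector<T> const& src, int rows, int cols,
                             int padded_rows, int padded_cols) {
  assert(padded_rows >= rows && padded_cols >= cols);
  assert(src.size() == static_cast<std::size_t>(rows) * cols);
  std::vector<T> dst(static_cast<std::size_t>(padded_rows) * padded_cols, T(0));
  for (int r = 0; r < rows; ++r) {
    for (int c = 0; c < cols; ++c) {
      dst[static_cast<std::size_t>(r) * padded_cols + c] =
          src[static_cast<std::size_t>(r) * cols + c];
    }
  }
  return dst;
}
```

The padded shape and buffers (with leading dimensions updated to the padded sizes) are what would then be set in `Arguments`. Zero-filling the padding along K leaves the original M x N block of the product unchanged; the extra output rows and columns can be ignored or sliced away afterwards.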

> Are there any examples of gather / scatter fusion and grouped_gemm specifically for Ampere architectures using Cutlass 3.0+ and CuTe?

We do not have examples of this.

> How...

@yuanjiechen, what values of `N H W C K R S P Q` are you using in your Python example?

Thanks for the details. Can you also tell me the stride, dilation, and padding values you used? I'll look into the alignment issue that you mentioned. Regarding easier support for...