LiuWei
> It will be in 2.9 Hi, 2.9 has been released. However, I did not see this useful script. Could you please tell me where it is?
> It is not ready yet. We will add it as a patch. OK. Do we have an expected release date? We need it.
> It is not ready yet. We will add it as a patch. @hwu36 I see from this script that you construct an IR for TF for GEMM fusion? I guess...
> > It is not ready yet. We will add it as a patch. > > OK. Do we have an expected release date? We need it. @hwu36 we need...
> @lw921014 could you please post the thread block and warp tile sizes? In case you haven't tried it, please sanity check if the following requirements are met: problem_N =...
> Correct. Do we need problem0_N = problem1_N?
> Correct. For this group of parameters, we satisfied problem0_N = threadblock0_N = warp0_N and problem1_N = threadblock1_N = warp1_N, but it still failed.
> You'll need the same number of warps for each GEMM. > > In your example above, you use 4 warps for the 1st GEMM, but use 2 warps for...
> 32 Sorry, that was my mistake.
> I changed it like this and it runs OK. > Note that this may not be performant compared with the non-fused case due to the small warp_M=16. Also the large warp_N=256 causes RF...
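
For anyone hitting the same failure, the two requirements discussed in this thread (problem_N = threadblock_N = warp_N for each GEMM, and the same number of warps for each GEMM) can be checked before building the kernel. Below is a minimal Python sketch, not part of CUTLASS: `Shape`, `warp_count`, and `check_fused_gemm_config` are hypothetical helper names, and the tile sizes are made-up illustrative values, not the ones from the original post. It assumes the warp count per threadblock is the threadblock tile divided by the warp tile in each of M, N, and K, and that the tiles divide evenly.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Shape:
    """An M x N x K tile shape (hypothetical helper, not a CUTLASS type)."""
    m: int
    n: int
    k: int


def warp_count(threadblock: Shape, warp: Shape) -> int:
    # Assumption: warps per threadblock = threadblock tile / warp tile in each
    # dimension, with the tiles dividing evenly.
    return (threadblock.m // warp.m) * (threadblock.n // warp.n) * (threadblock.k // warp.k)


def check_fused_gemm_config(problem_n: List[int],
                            threadblocks: List[Shape],
                            warps: List[Shape]) -> List[str]:
    """Return a list of violated requirements (empty if the config passes)."""
    errors = []
    # Requirement 1: problem_N == threadblock_N == warp_N for every GEMM.
    for i, (pn, tb, w) in enumerate(zip(problem_n, threadblocks, warps)):
        if not (pn == tb.n == w.n):
            errors.append(
                f"GEMM{i}: problem_N={pn}, threadblock_N={tb.n}, warp_N={w.n} must all be equal"
            )
    # Requirement 2: every GEMM must use the same number of warps.
    counts = [warp_count(tb, w) for tb, w in zip(threadblocks, warps)]
    if len(set(counts)) > 1:
        errors.append(f"warp counts differ between GEMMs: {counts}")
    return errors


# Illustrative (made-up) tile sizes: the first GEMM uses 4 warps, the second 2.
bad = check_fused_gemm_config(
    problem_n=[64, 256],
    threadblocks=[Shape(64, 64, 32), Shape(64, 256, 32)],
    warps=[Shape(16, 64, 32), Shape(32, 256, 32)],
)

# Shrinking warp1_M from 32 to 16 gives 4 warps for both GEMMs.
good = check_fused_gemm_config(
    problem_n=[64, 256],
    threadblocks=[Shape(64, 64, 32), Shape(64, 256, 32)],
    warps=[Shape(16, 64, 32), Shape(16, 256, 32)],
)

print(bad)
print(good)
```

This only checks the two constraints named above. It says nothing about performance, which the last reply warns may suffer with a small warp_M and a large warp_N.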