
Is Tensile adapted to RDNA2?

Open v01dXYZ opened this issue 2 years ago • 1 comments

Hello,

As you may know, RDNA2 has a 128 MB L3 cache, which is an important difference from the GCN/CDNA architectures. It allows efficient use of a memory subsystem with a narrower bus (although its throughput is still higher than a Vega 10's): 8 Samsung GDDR6 chips (8 x 32 bits x 16 Gbps).

Are Tensile or MISA adapted to a microarchitecture where caching (i.e. spatial/temporal locality) is central to achieving peak performance? Do you think RDNA2 could be as good as, or even better than, a GCN/CDNA architecture for GEMM by keeping blocks in the L3 cache as long as possible?

We have 128 MB / 160 wavefronts ~= 800 KB per wavefront (160 wavefronts = 80 CUs * 2 concurrent 32-lane wavefronts per CU). That is not far from the L2 cache found on CPUs (Ryzen 5xxx series: 512 KB L2 cache).
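For reference, the back-of-the-envelope numbers above can be reproduced with a few lines of Python. This is only a rough sketch: every input is a figure quoted in this post (including the assumption of 2 concurrent wavefronts per CU), the function names are made up for illustration, and decimal units are used (1 MB = 1000 KB).

```python
# Rough arithmetic only; all inputs are the figures quoted in the post,
# not values queried from hardware or taken from Tensile.

def gddr6_bandwidth_gb_s(chips: int, bits_per_chip: int, gbps_per_pin: float) -> float:
    """Peak bandwidth in GB/s: total bus width (bits) * per-pin rate (Gbit/s) / 8."""
    return chips * bits_per_chip * gbps_per_pin / 8


def cache_per_wavefront_kb(cache_mb: float, cus: int, wavefronts_per_cu: int) -> float:
    """Even share of the cache per wavefront, in KB (decimal: 1 MB = 1000 KB)."""
    return cache_mb * 1000 / (cus * wavefronts_per_cu)


# 8 GDDR6 chips, 32 bits each, 16 Gbps per pin
bandwidth = gddr6_bandwidth_gb_s(chips=8, bits_per_chip=32, gbps_per_pin=16)

# 128 MB L3, 80 CUs, 2 concurrent wavefronts per CU (assumption from the post)
share = cache_per_wavefront_kb(cache_mb=128, cus=80, wavefronts_per_cu=2)

print(f"peak GDDR6 bandwidth: {bandwidth} GB/s")
print(f"L3 share per wavefront: {share} KB")
```

The per-wavefront share is just an even split of the cache; it says nothing about how the hardware or a GEMM kernel would actually distribute cache lines.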

v01dXYZ · Aug 29 '22 16:08

Yes, Tensile supports RDNA2. Assigning this to @TonyYHsieh for further support.

bragadeesh · Nov 01 '22 17:11

@v01dXYZ Do you still need assistance with this ticket? If not, please close the ticket. Thanks!

ppanchad-amd · Jul 15 '24 20:07