KataGo
How to use the performance of "FP16 Tensor TFLOPs with FP16 Accumulate"?
I noticed that on many graphics cards with Tensor Cores, the peak performance of "FP16 Tensor TFLOPs with FP16 Accumulate" is twice the peak performance of "FP16 Tensor TFLOPs with FP32 Accumulate". Could KataGo potentially switch to using FP16 accumulation? Furthermore, would it be possible to utilize FP8 performance? That way, the computing power of Ada-architecture graphics cards could be fully leveraged.
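For context, the difference between the two modes is just the accumulation type requested for the GEMM. Below is a minimal cuBLAS sketch (not KataGo code; KataGo's backends go through cuDNN/TensorRT/OpenCL, and the matrix size `N` and fill values here are arbitrary placeholders) showing the same FP16-input multiply issued once with FP32 accumulation and once with FP16 accumulation; error checking is omitted for brevity:

```cpp
// Sketch: FP16-input GEMM with FP32 vs FP16 accumulation via cublasGemmEx.
#include <cublas_v2.h>
#include <cuda_fp16.h>
#include <cuda_runtime.h>
#include <cstdio>
#include <vector>

int main() {
  const int N = 1024;  // placeholder square GEMM: C = A * B
  const size_t bytes = (size_t)N * N * sizeof(__half);

  __half *dA, *dB, *dC;
  cudaMalloc(&dA, bytes);
  cudaMalloc(&dB, bytes);
  cudaMalloc(&dC, bytes);

  std::vector<__half> host(N * N, __float2half(0.01f));  // arbitrary data
  cudaMemcpy(dA, host.data(), bytes, cudaMemcpyHostToDevice);
  cudaMemcpy(dB, host.data(), bytes, cudaMemcpyHostToDevice);

  cublasHandle_t handle;
  cublasCreate(&handle);

  // FP32 accumulate: FP16 inputs, but Tensor Cores keep the running sums
  // in FP32. This is the slower, more accurate mode.
  float alphaF = 1.0f, betaF = 0.0f;
  cublasGemmEx(handle, CUBLAS_OP_N, CUBLAS_OP_N, N, N, N,
               &alphaF, dA, CUDA_R_16F, N, dB, CUDA_R_16F, N,
               &betaF, dC, CUDA_R_16F, N,
               CUBLAS_COMPUTE_32F, CUBLAS_GEMM_DEFAULT);

  // FP16 accumulate: the mode with the 2x peak-TFLOPs rating. The running
  // sums themselves are FP16, which is where the accuracy risk lies.
  __half alphaH = __float2half(1.0f), betaH = __float2half(0.0f);
  cublasGemmEx(handle, CUBLAS_OP_N, CUBLAS_OP_N, N, N, N,
               &alphaH, dA, CUDA_R_16F, N, dB, CUDA_R_16F, N,
               &betaH, dC, CUDA_R_16F, N,
               CUBLAS_COMPUTE_16F, CUBLAS_GEMM_DEFAULT);

  cudaDeviceSynchronize();
  printf("both GEMMs launched\n");

  cublasDestroy(handle);
  cudaFree(dA); cudaFree(dB); cudaFree(dC);
  return 0;
}
```

The trade-off is that FP16 accumulation halves the precision of the partial sums, so long dot products can overflow or lose accuracy, which is presumably why frameworks default to FP32 accumulation for neural-net inference.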