Irr-free

Results 2 issues of Irr-free

Hi! I want to know whether llama3 uses Tensor Cores in its code, and whether it supports Tensor Core processing out of the box by default.

I'm using llama3 with a single NVIDIA V100 GPU (32 GiB memory). When I increase the batch size from 1 to 8, the inference throughput does not increase; it decreases....
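To make the batch-size comparison concrete, here is a minimal sketch of how throughput (tokens per second) could be measured per batch size. It is not taken from the llama3 code: `measure_throughput` and `fake_generate` are hypothetical names, and `fake_generate` is only a stand-in that sleeps instead of calling a model. When timing a real GPU model with PyTorch, the callable would also need to call `torch.cuda.synchronize()` before returning, since CUDA kernels run asynchronously and the timer would otherwise stop too early.

```python
import time
from typing import Callable, Dict, Iterable


def measure_throughput(
    generate: Callable[[int], int],
    batch_sizes: Iterable[int],
    repeats: int = 3,
) -> Dict[int, float]:
    """Return tokens per second for each batch size.

    `generate(batch_size)` is a hypothetical callable that runs one inference
    call and returns the number of tokens produced across the whole batch.
    """
    results: Dict[int, float] = {}
    for batch_size in batch_sizes:
        generate(batch_size)  # warm-up call, excluded from the timing
        tokens = 0
        start = time.perf_counter()
        for _ in range(repeats):
            tokens += generate(batch_size)
        elapsed = time.perf_counter() - start
        results[batch_size] = tokens / elapsed
    return results


def fake_generate(batch_size: int) -> int:
    # Stand-in for a real model call (assumption): a fixed overhead plus a
    # small per-sequence cost, producing 16 tokens per sequence.
    time.sleep(0.01 + 0.002 * batch_size)
    return batch_size * 16


if __name__ == "__main__":
    for batch_size, tps in measure_throughput(fake_generate, [1, 2, 4, 8]).items():
        print(f"batch_size={batch_size}: {tps:.1f} tokens/s")
```

Counting tokens across the whole batch, rather than timing a single call, keeps per-request latency and overall throughput separate: a larger batch usually takes longer per call, so only the tokens-per-second figure shows whether batching actually helps.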