James Whedbee

Results 1 issues of James Whedbee

### Some context I am using AMD MI100 GPUs and I can get ~33 tokens/second for Llama 2 70B using - compile - tensor parallelism of 8 - int8 quantization...