Vishal Jain

Results: 1 issue from Vishal Jain

### Describe the issue

I am trying to quantize and run the `Llama-2-7b-hf` model using the example [here](https://github.com/microsoft/onnxruntime-inference-examples/tree/main/quantization/language_model/llama/weight_only_quant). I was able to successfully generate the `int4` model with GPTQ quantization by...

release:1.17.0
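The issue above describes producing an `int4` model with GPTQ weight-only quantization via ONNX Runtime's tooling. Below is a minimal sketch of what that step typically looks like, assuming the `MatMul4BitsQuantizer` / `GPTQWeightOnlyQuantConfig` API available in `onnxruntime.quantization` around the 1.17 timeframe; the file paths and the calibration data reader are placeholders, and the linked example's script may wire this up differently.

```python
# Minimal sketch of int4 GPTQ weight-only quantization with onnxruntime.
# Assumes the MatMul4BitsQuantizer / GPTQWeightOnlyQuantConfig API shipped
# around onnxruntime 1.17; paths and the data reader below are placeholders.
import onnx
from onnxruntime.quantization import CalibrationDataReader
from onnxruntime.quantization.matmul_4bits_quantizer import (
    GPTQWeightOnlyQuantConfig,
    MatMul4BitsQuantizer,
)


class PromptDataReader(CalibrationDataReader):
    """Placeholder calibration reader; a real one yields tokenized prompts
    as dicts mapping input names to numpy arrays, then None when exhausted."""

    def __init__(self, samples):
        self._iter = iter(samples)

    def get_next(self):
        return next(self._iter, None)


# Load the exported FP32/FP16 ONNX graph (large models use external data files).
model = onnx.load("llama-2-7b-hf.onnx", load_external_data=True)

# GPTQ needs calibration data; RTN (the default algorithm) does not.
algo_config = GPTQWeightOnlyQuantConfig(calibration_data_reader=PromptDataReader([]))
quantizer = MatMul4BitsQuantizer(
    model,
    block_size=32,
    is_symmetric=True,
    algo_config=algo_config,
)
quantizer.process()

# The quantized graph lives on quantizer.model (an ONNXModel wrapper).
quantizer.model.save_model_to_file(
    "llama-2-7b-hf-int4-gptq.onnx", use_external_data_format=True
)
```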