
CI for adding GPTQ and AWQ int4 support on the Intel platform

Open · ErikKaum opened this issue 1 year ago · 3 comments

Run CI for PR #2444.

ErikKaum · Sep 05 '24

I checked the 3 failure cases and found they are related to the bug fix in https://github.com/huggingface/text-generation-inference/pull/2444/files#diff-d8aff332cf9104dd7460d2f53575239dc1f4bcdd374e575b8a504568bfc2e078R325, which causes "Narsil/starcoder-gptq" with 2 TP to stop using the exllama kernel. If you check 1 TP of this model, which does use the exllama kernel, the generation result is close to the current 2 TP result, which uses exllama.
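One way to see whether the kernel selection (exllama vs. the fallback) changes the output is to run the model once with 1 shard and once with 2 shards and compare greedy generations. A minimal sketch, assuming two TGI servers for Narsil/starcoder-gptq are already running locally (the ports and prompt below are placeholders, not taken from the CI suite):

```python
# Hypothetical comparison of a 1-TP and a 2-TP TGI deployment of
# Narsil/starcoder-gptq. Endpoint URLs are assumptions; adjust to your setup.
from huggingface_hub import InferenceClient

PROMPT = "def print_hello_world():"  # placeholder prompt

client_1tp = InferenceClient("http://localhost:8080")  # server launched with --num-shard 1
client_2tp = InferenceClient("http://localhost:8081")  # server launched with --num-shard 2

# Greedy decoding so any difference comes from the kernels, not sampling.
out_1tp = client_1tp.text_generation(PROMPT, max_new_tokens=64, do_sample=False)
out_2tp = client_2tp.text_generation(PROMPT, max_new_tokens=64, do_sample=False)

print("1 TP:", out_1tp)
print("2 TP:", out_2tp)
print("identical:", out_1tp == out_2tp)
```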

sywangyi · Sep 06 '24

@Narsil to comment.

sywangyi · Sep 06 '24

The failures seem unrelated to this PR:

ERROR integration-tests/models/test_flash_medusa.py::test_flash_medusa_simple - RuntimeError: Launcher crashed
ERROR integration-tests/models/test_flash_medusa.py::test_flash_medusa_all_params - RuntimeError: Launcher crashed
ERROR integration-tests/models/test_flash_medusa.py::test_flash_medusa_load - RuntimeError: Launcher crashed

sywangyi · Sep 13 '24

Merged in https://github.com/huggingface/text-generation-inference/pull/2665

Narsil · Oct 25 '24