CI for adding GPTQ and AWQ INT4 support on the Intel platform
Run CI for PR #2444
I checked the 3 failure cases and found they are related to the bug fix in https://github.com/huggingface/text-generation-inference/pull/2444/files#diff-d8aff332cf9104dd7460d2f53575239dc1f4bcdd374e575b8a504568bfc2e078R325, which causes "Narsil/starcoder-gptq" at 2 TP to stop using the exllama kernel. If you check this model at 1 TP, which does use the exllama kernel, the generation result is close to the current 2 TP result that was produced with exllama.
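For context, here is a minimal sketch of why tensor parallelism can change the kernel path. The function name, parameters, and the divisibility rule below are hypothetical illustrations, not TGI's actual logic:

```python
# Hypothetical sketch, not TGI's actual code: shows how sharding a GPTQ
# layer across TP ranks can invalidate a fast-kernel precondition, so the
# same model takes the exllama path at 1 TP but falls back at 2 TP.
def can_use_exllama(in_features: int, group_size: int, tp_world_size: int) -> bool:
    # Each rank holds a 1/tp_world_size slice of the weight's input dimension.
    shard_in_features = in_features // tp_world_size
    # Assumed requirement: the per-shard input dimension must split evenly
    # into quantization groups for the fused kernel to be usable.
    return shard_in_features % group_size == 0


# A layer that qualifies at 1 TP may no longer qualify at 2 TP.
print(can_use_exllama(in_features=384, group_size=128, tp_world_size=1))  # True
print(can_use_exllama(in_features=384, group_size=128, tp_world_size=2))  # False
```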
@Narsil to comment.
Seems the failure is not related to the PR:

```
ERROR integration-tests/models/test_flash_medusa.py::test_flash_medusa_simple - RuntimeError: Launcher crashed
ERROR integration-tests/models/test_flash_medusa.py::test_flash_medusa_all_params - RuntimeError: Launcher crashed
ERROR integration-tests/models/test_flash_medusa.py::test_flash_medusa_load - RuntimeError: Launcher crashed
```
Merged in https://github.com/huggingface/text-generation-inference/pull/2665