
Add GPTQ and AWQ int4 support on the Intel platform

Open · sywangyi opened this pull request 1 year ago · 10 comments

What does this PR do?

Fixes # (issue)

Before submitting

  • [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • [ ] Did you read the contributor guideline, Pull Request section?
  • [ ] Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
  • [ ] Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
  • [ ] Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.
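
For context, once this support lands, serving a GPTQ or AWQ int4 checkpoint on Intel hardware should only need the usual launcher quantization flag. A minimal sketch, assuming TGI's existing CLI options; the model IDs are illustrative examples, not what was actually tested in this PR:

    # GPTQ int4 model on an Intel CPU/XPU host (sketch):
    text-generation-launcher \
        --model-id TheBloke/Llama-2-7B-Chat-GPTQ \
        --quantize gptq \
        --port 8080

    # AWQ int4 model; only the --quantize value changes:
    text-generation-launcher \
        --model-id TheBloke/Llama-2-7B-Chat-AWQ \
        --quantize awq \
        --port 8080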

sywangyi avatar Aug 22 '24 05:08 sywangyi

@Narsil @danieldk please help review.

sywangyi avatar Aug 22 '24 05:08 sywangyi

@ErikKaum could you help review the PR?

sywangyi avatar Sep 03 '24 01:09 sywangyi

Hi @sywangyi 👋

Yes, let me run the tests in a separate branch so that we don't get the permission errors 👍 I should have time to do it today or tomorrow 👍

ErikKaum avatar Sep 03 '24 08:09 ErikKaum

@ErikKaum @Narsil I have uploaded a fix for the CI; please rerun it.

sywangyi avatar Sep 10 '24 06:09 sywangyi

@sywangyi there still seems to be an error in the Dockerfile:

Dockerfile_intel:154
--------------------
 152 |     RUN git clone https://github.com/intel/intel-extension-for-pytorch && cd intel-extension-for-pytorch && git checkout f86e93e4890dc2c989024d148d415c9aa8a1649f
 153 |     RUN git clone https://github.com/intel/torch-ccl.git && cd torch-ccl && git checkout v2.4.0+cpu+rc0
 154 | >>> RUN cd intel-extension-for-pytorch && git submodule sync && git submodule update --init --recursive && python setup.py install
 155 |     RUN cd torch-ccl && git submodule sync && git submodule update --init --recursive && pip install .
 156 |     
--------------------
ERROR: failed to solve: process "/bin/sh -c cd intel-extension-for-pytorch && git submodule sync && git submodule update --init --recursive && python setup.py install" did not complete successfully: exit code: 1
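
One way to get past the truncated CI log is to reproduce the failing layer locally with unabridged build output. A minimal sketch, assuming a local Docker setup comparable to the CI builder:

    # Rebuild Dockerfile_intel with plain progress output so the actual
    # compiler/setup.py error from the intel-extension-for-pytorch step is visible.
    docker build -f Dockerfile_intel --progress=plain . 2>&1 | tee build.log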

ErikKaum avatar Sep 10 '24 09:09 ErikKaum

@ErikKaum Could you help retrigger the CI build for intel-cpu? We did not see this build error in the previous CI run, and I have not made any changes to Dockerfile_intel in the new commits.

sywangyi avatar Sep 10 '24 12:09 sywangyi

I will rework it after https://github.com/huggingface/text-generation-inference/pull/2517 is merged, since Python is upgraded from 3.10 to 3.11 there.

sywangyi avatar Sep 12 '24 12:09 sywangyi

@ErikKaum the rebase is done; please retrigger the CI, review, and merge.

sywangyi avatar Sep 13 '24 01:09 sywangyi

The failures do not seem to be related to this PR:

ERROR integration-tests/models/test_flash_medusa.py::test_flash_medusa_simple - RuntimeError: Launcher crashed
ERROR integration-tests/models/test_flash_medusa.py::test_flash_medusa_all_params - RuntimeError: Launcher crashed
ERROR integration-tests/models/test_flash_medusa.py::test_flash_medusa_load - RuntimeError: Launcher crashed
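
To double-check that these are pre-existing flakes rather than something introduced here, the affected tests can be re-run in isolation. A hedged sketch, assuming the repository's integration-test layout and a working local launcher/test environment:

    # Run only the medusa integration tests that crashed in CI,
    # stopping at the first failure.
    pytest integration-tests/models/test_flash_medusa.py -x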

sywangyi avatar Sep 17 '24 08:09 sywangyi

@ErikKaum could you help retrigger it?

sywangyi avatar Sep 17 '24 08:09 sywangyi

This PR is also needed to make mllama output correct on ipex-cpu, since it upgrades ipex. Could anyone help merge it?

sywangyi avatar Oct 08 '24 03:10 sywangyi

@ErikKaum @Narsil, please help. cc @yao-matrix

sywangyi avatar Oct 08 '24 03:10 sywangyi

It was merged via an updated PR I prepared for CI (https://github.com/huggingface/text-generation-inference/pull/2665); only minor fixes to the control flow and a few added comments were made.

Narsil avatar Oct 18 '24 16:10 Narsil