GenAIExamples
[BUG] TGI versions inconsistency / use of old TGI versions
Currently the latest TGI versions used in this repo are v2.3.1 (Gaudi) / v2.4.1 (CPU).
However, there are several files where much older versions are used.
GenAIExamples, old CPU/rocm versions:
GenAIExamples$ git grep text-generation-inference: | grep -v -e github -e 2.[34].[01]
AudioQnA/kubernetes/gmc/README.md:- tgi-service: ghcr.io/huggingface/text-generation-inference:1.4
ChatQnA/docker_compose/nvidia/gpu/compose.yaml: image: ghcr.io/huggingface/text-generation-inference:2.2.0
DBQnA/docker_compose/intel/cpu/xeon/README.md:docker run -d --name="test-text2sql-tgi-endpoint" --ipc=host -p $TGI_PORT:80 -v ./data:/data --shm-size 1g -e HUGGINGFACEHUB_API_TOKEN=${HUGGINGFACEHUB_API_TOKEN} -e HF_TOKEN=${HF_TOKEN} -e model=${model} ghcr.io/huggingface/text-generation-inference:2.1.0 --model-id $model
DBQnA/docker_compose/intel/cpu/xeon/compose.yaml: image: ghcr.io/huggingface/text-generation-inference:2.1.0
DocSum/tests/test_compose_on_rocm.sh: docker pull ghcr.io/huggingface/text-generation-inference:1.4
DocSum/tests/test_compose_on_xeon.sh: docker pull ghcr.io/huggingface/text-generation-inference:1.4
FaqGen/tests/test_compose_on_xeon.sh: docker pull ghcr.io/huggingface/text-generation-inference:1.4
MultimodalQnA/docker_compose/amd/gpu/rocm/compose.yaml: image: ghcr.io/huggingface/text-generation-inference:3.0.1-rocm
GenAIComps, old CPU versions:
GenAIComps$ git grep text-generation-inference: | grep -v -e github -e 2.[34].[01]
comps/text2sql/src/README.md:docker run -d --name="text2sql-tgi-endpoint" --ipc=host -p $TGI_PORT:80 -v ./data:/data --shm-size 1g -e HF_TOKEN=${HUGGINGFACEHUB_API_TOKEN} -e model=${LLM_MODEL_ID} ghcr.io/huggingface/text-generation-inference:2.1.0 --model-id $LLM_MODEL_ID
GenAIExamples, old Gaudi versions (latest used version is 2.3.1):
$ git grep tgi-gaudi:2.0 | wc -l
40
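A bulk update of those 40 references can be scripted. Below is a minimal sketch; the target tag 2.3.1 comes from this issue, but the `sed` pattern is an assumption and matches should be reviewed (`git grep -n`) before running `sed -i` for real. The runnable part operates on a single sample line instead of a checkout:

```shell
# Hypothetical bulk update of old tgi-gaudi tags. In a real checkout:
#   git grep -n 'tgi-gaudi:2\.0'                                  # review matches first
#   git grep -l 'tgi-gaudi:2\.0' \
#     | xargs sed -i 's|tgi-gaudi:2\.0[0-9.]*|tgi-gaudi:2.3.1|g'  # then rewrite

# Demonstrated here on one sample line standing in for a file:
line='image: ghcr.io/huggingface/tgi-gaudi:2.0.5'
printf '%s\n' "$line" | sed 's|tgi-gaudi:2\.0[0-9.]*|tgi-gaudi:2.3.1|g'
```

The `[0-9.]*` suffix is deliberate so that both `:2.0` and patch tags like `:2.0.5` collapse to the same replacement.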
PS. All TEI image references are for version 1.5, i.e. consistent.
@eero-t Thanks! Good catch.
Do you have a plan to submit a PR to fix this?
> Do you have a plan to submit a PR to fix this?
@xiguiw No.
(Fixing this could be a good "beginner" / "first time" task PR.)
OPEA_Team4 is working on this issue
Hi @xiaotia3, thanks a lot for the contribution. I will remind the team to review. Thanks!
> PS. All TEI image references are for 1.5.0 version, i.e. consistent.
But somewhat out of date: 1.5.0 was released last summer, whereas the latest tei-gaudi release is 1.5.3 (and the GenAIComps project uses 1.5.2): https://github.com/huggingface/tei-gaudi/releases
GenAIComps CPU/rocm TGI versions are now consistent, but this repo is not quite done yet; there is still a lot of discrepancy.
While most references are now on TGI 2.4.x, some references to older versions still exist:
GenAIExamples$ git grep text-generation-inference: | wc -l
87
GenAIExamples$ git grep text-generation-inference: | grep -v 2.4.1 | wc -l
55
GenAIExamples$ git grep text-generation-inference: | grep -v 2.4 | wc -l
20
Same thing with Gaudi version:
GenAIExamples$ git grep /tgi-gaudi: | wc -l
51
GenAIExamples$ git grep /tgi-gaudi: | grep -v 2.3.1 | wc -l
8
Also in GenAIComps:
GenAIComps$ git grep /tgi-gaudi: | wc -l
10
GenAIComps$ git grep /tgi-gaudi: | grep -v 2.3.1 | wc -l
4
@chensuyue please re-open.
@xiaotia3 will you continue submitting PRs for this issue?
> @xiaotia3 will you continue submitting PRs for this issue?
I will. The old-version images that still exist were likely introduced during the period when the PR was waiting to be merged. Let me update them.
And due to known issues, ChatQnA and AvatarChatbot may not be updated; is that OK?
> And due to known issues, ChatQnA and AvatarChatbot may not be updated; is that OK?
Those can be updated in a separate PR after their issues have been fixed.
OPEA_Team4 is working on this issue
@zhanmyz @xiaotia3 #1625 is merged.
Thanks for your contributions. I found two remaining old TGI images.
Would you please help with these? Thanks!
GenAIExamples DocSum/tests/test_compose_tgi_on_xeon.sh: docker pull ghcr.io/huggingface/text-generation-inference:1.4
GenAIComps comps/text2sql/src/README.md: docker run -d --name="text2sql-tgi-endpoint" --ipc=host -p $TGI_PORT:80 -v ./data:/data --shm-size 1g -e HF_TOKEN=${HUGGINGFACEHUB_API_TOKEN} -e model=${LLM_MODEL_ID} ghcr.io/huggingface/text-generation-inference:2.1.0 --model-id $LLM_MODEL_ID