text-generation-inference
AttributeError: 'NoneType' object has no attribute 'replace'
System Info
Docker image: ghcr.io/huggingface/text-generation-inference:2.2.0-rocm
Hardware: AMD MI250
Information
- [X] Docker
- [ ] The CLI directly
Tasks
- [x] An officially supported command
- [ ] My own modifications
Reproduction
- Run `docker run --device /dev/kfd --device /dev/dri ghcr.io/huggingface/text-generation-inference:2.2.0-rocm --model-id TinyLlama/TinyLlama-1.1B-Chat-v1.0`
- Wait until the warmup step
Expected behavior
The model should deploy since it's officially supported, but instead I get:

```
File "/opt/conda/lib/python3.10/site-packages/text_generation_server/models/flash_causal_lm.py", line 1160, in warmup
    f"tunableop_{MODEL_ID.replace('/', '-')}_tp{self.world_size}_rank{self.rank}.csv",
AttributeError: 'NoneType' object has no attribute 'replace'
```
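The traceback shows `MODEL_ID` is `None` when the warmup step builds the TunableOp CSV filename, so `.replace('/', '-')` raises. A minimal sketch of the failure and a defensive guard (this is a hypothetical illustration, not the actual TGI patch; `tunableop_filename` and the `"unknown"` fallback are my own names):

```python
def tunableop_filename(model_id, world_size, rank):
    """Build the TunableOp results filename, tolerating a missing model id."""
    # The original code calls model_id.replace('/', '-') unconditionally,
    # which raises AttributeError when model_id is None.
    safe_id = model_id.replace("/", "-") if model_id is not None else "unknown"
    return f"tunableop_{safe_id}_tp{world_size}_rank{rank}.csv"

# With a model id set, the path is sanitized as expected:
print(tunableop_filename("TinyLlama/TinyLlama-1.1B-Chat-v1.0", 1, 0))
# tunableop_TinyLlama-TinyLlama-1.1B-Chat-v1.0_tp1_rank0.csv

# With model_id=None (the reported case), the guard avoids the crash:
print(tunableop_filename(None, 1, 0))
# tunableop_unknown_tp1_rank0.csv
```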
Thanks for reporting this and adding the PR 🙌
We're a bit low on bandwidth but can hopefully take a look at it asap 👍
@almersawi As a workaround, you can disable the PYTORCH_TUNABLEOP feature at startup.
With TunableOp disabled, the server starts, though still without the warmup step:

```
docker run --device /dev/kfd --device /dev/dri -e PYTORCH_TUNABLEOP_ENABLED=0 ghcr.io/huggingface/text-generation-inference:2.2.0-rocm --model-id TinyLlama/TinyLlama-1.1B-Chat-v1.0
```