text-generation-inference
AttributeError: no attribute 'model' when using llava-next with lora-adapters
System Info
versions:
- text-generation-inference: latest docker image
- os: Debian GNU/Linux 11
- model: llava-hf/llava-v1.6-mistral-7b-hf
Information
- [x] Docker
- [ ] The CLI directly
Tasks
- [x] An officially supported command
- [ ] My own modifications
Reproduction
- Finetune llava-next using `SFTTrainer` with a `LoraConfig` and save the adapter with `trainer.save_model(...)` (in my case to `~/.cache/huggingface/adapters/test-local-7`)
- Start TGI with the lora-adapter:
```shell
model="llava-hf/llava-v1.6-mistral-7b-hf"
volume=~/.cache/huggingface/
lora="test-local-7=/data/adapters/test-local-7/"

docker run --gpus all --shm-size 1g -p 8080:80 -v $volume:/data \
    ghcr.io/huggingface/text-generation-inference:latest \
    --model-id $model --lora-adapters $lora
```
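For reference, the directory passed via `--lora-adapters` is a standard peft export, i.e. it contains `adapter_config.json` and `adapter_model.safetensors`. A minimal sketch that just validates that layout (the `check_adapter_dir` helper and the throwaway directory are mine for illustration, not part of TGI):

```python
import json
import tempfile
from pathlib import Path

def check_adapter_dir(path: Path) -> dict:
    """Verify `path` looks like a standard peft LoRA export
    (adapter_config.json + adapter_model.safetensors) and return
    the parsed adapter config."""
    config_file = path / "adapter_config.json"
    weights_file = path / "adapter_model.safetensors"
    if not config_file.is_file() or not weights_file.is_file():
        raise FileNotFoundError(f"{path} does not look like a peft adapter directory")
    return json.loads(config_file.read_text())

# Demonstration with a throwaway directory standing in for
# ~/.cache/huggingface/adapters/test-local-7 (mounted into the
# container as /data/adapters/test-local-7):
with tempfile.TemporaryDirectory() as tmp:
    adapter_dir = Path(tmp) / "test-local-7"
    adapter_dir.mkdir()
    (adapter_dir / "adapter_config.json").write_text(
        json.dumps({"base_model_name_or_path": "llava-hf/llava-v1.6-mistral-7b-hf", "r": 16})
    )
    (adapter_dir / "adapter_model.safetensors").write_bytes(b"")  # placeholder weights
    config = check_adapter_dir(adapter_dir)
    print(config["r"])  # 16
```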
Expected behavior
Expected TGI to start with the adapter loaded, but it fails with:
- AttributeError: 'LlavaNextForConditionalGeneration' object has no attribute 'model'
The error is raised in `/usr/src/server/text_generation_server/models/__init__.py:1459`, in `get_model_with_lora_adapters`:

```python
1456
1457             for layer_name in adapter_layers:
1458                 nlayers = (
1459 ❱                   1 if layer_name == "lm_head" else len(model.model.model.layers)
1460                 )
1461                 adapter_weights = LoraWeights.prepare_weights(
1462                     config=adapter_config,
```
and an excerpt from the locals:

```
adapter_config = LoraConfig(
    base_model_name_or_path='llava-hf/llava-v1.6-mis…',
    r=16, ...
    target_modules={ ... },
)
...
model = <text_generation_server.models.vlm_causal_lm.Vlm… object at 0x7f316293c4d0>
...
model_id = 'llava-hf/llava-v1.6-mistral-7b-hf'
```
Full traceback:

```
2025-01-20T13:19:22.228541Z  INFO text_generation_launcher: Loading adapter weights into model: test-local-7
2025-01-20T13:19:22.266211Z ERROR text_generation_launcher: Error when initializing model
Traceback (most recent call last):
  File "/opt/conda/bin/text-generation-server", line 10, in <module>
    sys.exit(app())
  File "/opt/conda/lib/python3.11/site-packages/typer/main.py", line 323, in __call__
    return get_command(self)(*args, **kwargs)
  File "/opt/conda/lib/python3.11/site-packages/click/core.py", line 1161, in __call__
    return self.main(*args, **kwargs)
  File "/opt/conda/lib/python3.11/site-packages/typer/core.py", line 743, in main
    return _main(
  File "/opt/conda/lib/python3.11/site-packages/typer/core.py", line 198, in _main
    rv = self.invoke(ctx)
  File "/opt/conda/lib/python3.11/site-packages/click/core.py", line 1697, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/opt/conda/lib/python3.11/site-packages/click/core.py", line 1443, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/opt/conda/lib/python3.11/site-packages/click/core.py", line 788, in invoke
    return __callback(*args, **kwargs)
  File "/opt/conda/lib/python3.11/site-packages/typer/main.py", line 698, in wrapper
    return callback(**use_params)
  File "/usr/src/server/text_generation_server/cli.py", line 119, in serve
    server.serve(
  File "/usr/src/server/text_generation_server/server.py", line 315, in serve
    asyncio.run(
  File "/opt/conda/lib/python3.11/asyncio/runners.py", line 190, in run
    return runner.run(main)
  File "/opt/conda/lib/python3.11/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
  File "/opt/conda/lib/python3.11/asyncio/base_events.py", line 641, in run_until_complete
    self.run_forever()
  File "/opt/conda/lib/python3.11/asyncio/base_events.py", line 608, in run_forever
    self._run_once()
  File "/opt/conda/lib/python3.11/asyncio/base_events.py", line 1936, in _run_once
    handle._run()
  File "/opt/conda/lib/python3.11/asyncio/events.py", line 84, in _run
    self._context.run(self._callback, *self._args)
> File "/usr/src/server/text_generation_server/server.py", line 268, in serve_inner
    model = get_model_with_lora_adapters(
  File "/usr/src/server/text_generation_server/models/__init__.py", line 1459, in get_model_with_lora_adapters
    1 if layer_name == "lm_head" else len(model.model.model.layers)
  File "/opt/conda/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1729, in __getattr__
    raise AttributeError(f"'{type(self).__name__}' object has no attribute '{name}'")
AttributeError: 'LlavaNextForConditionalGeneration' object has no attribute 'model'
```
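For what it's worth, the lookup at `models/__init__.py:1459` seems to assume a text-only stack, where `model.model` is a `*ForCausalLM` whose decoder layers sit at `.model.layers`. With llava-next, `model.model` is the multimodal wrapper, and (if I read transformers' llava-next implementation correctly) the text decoder is nested under `language_model` instead. A pure-Python mock of the two layouts; all classes here are stand-ins I wrote to reproduce the lookup, not the real modules:

```python
# Mock of the nesting involved; the attribute names are an assumption
# based on transformers' llava-next implementation, where the text
# decoder sits under `language_model`, not `model`.

class TextBackbone:
    def __init__(self, n_layers):
        self.layers = list(range(n_layers))

class MistralForCausalLM:
    def __init__(self, n_layers):
        self.model = TextBackbone(n_layers)

class LlavaNextForConditionalGeneration:
    def __init__(self, n_layers):
        self.language_model = MistralForCausalLM(n_layers)

class VlmCausalLM:
    def __init__(self, n_layers):
        self.model = LlavaNextForConditionalGeneration(n_layers)

model = VlmCausalLM(n_layers=32)

# What models/__init__.py:1459 effectively does -- reproduces the error:
try:
    nlayers = len(model.model.model.layers)
except AttributeError as e:
    print(type(e).__name__, e)

# For this layout the decoder layers sit one attribute deeper:
nlayers = len(model.model.language_model.model.layers)
print(nlayers)  # 32
```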