text-generation-inference
AttributeError: no attribute 'model' when using llava-next with lora-adapters
System Info
versions:
- text-generation-inference: latest docker image
- os: Debian GNU/Linux 11
- model: llava-hf/llava-v1.6-mistral-7b-hf
Information
- [x] Docker
- [ ] The CLI directly
Tasks
- [x] An officially supported command
- [ ] My own modifications
Reproduction
- Finetune llava-next using `SFTTrainer` with a `LoraConfig` and save the adapter with `trainer.save_model(...)` (in my case to `~/.cache/huggingface/adapters/test-local-7`)
- Start TGI with the lora-adapter:
```shell
model="llava-hf/llava-v1.6-mistral-7b-hf"
volume=~/.cache/huggingface/
lora="test-local-7=/data/adapters/test-local-7/"

docker run --gpus all --shm-size 1g -p 8080:80 -v $volume:/data \
    ghcr.io/huggingface/text-generation-inference:latest \
    --model-id $model --lora-adapters $lora
```
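For reference, the directory passed via `--lora-adapters` is a standard peft export, i.e. it contains `adapter_config.json` and `adapter_model.safetensors`. A minimal sketch that just validates that layout (the `check_adapter_dir` helper and the throwaway directory are mine for illustration, not part of TGI):

```python
import json
import tempfile
from pathlib import Path

def check_adapter_dir(path: Path) -> dict:
    """Verify `path` looks like a standard peft LoRA export
    (adapter_config.json + adapter_model.safetensors) and return
    the parsed adapter config."""
    config_file = path / "adapter_config.json"
    weights_file = path / "adapter_model.safetensors"
    if not config_file.is_file() or not weights_file.is_file():
        raise FileNotFoundError(f"{path} does not look like a peft adapter directory")
    return json.loads(config_file.read_text())

# Demonstration with a throwaway directory standing in for
# ~/.cache/huggingface/adapters/test-local-7 (mounted into the
# container as /data/adapters/test-local-7):
with tempfile.TemporaryDirectory() as tmp:
    adapter_dir = Path(tmp) / "test-local-7"
    adapter_dir.mkdir()
    (adapter_dir / "adapter_config.json").write_text(
        json.dumps({"base_model_name_or_path": "llava-hf/llava-v1.6-mistral-7b-hf", "r": 16})
    )
    (adapter_dir / "adapter_model.safetensors").write_bytes(b"")  # placeholder weights
    config = check_adapter_dir(adapter_dir)
    print(config["r"])  # 16
```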
Expected behavior
Expected TGI to start with the adapter loaded, but it fails with:
- AttributeError: 'LlavaNextForConditionalGeneration' object has no attribute 'model'
The error is raised in `/usr/src/server/text_generation_server/models/__init__.py:1459`, in `get_model_with_lora_adapters`:

```python
1456
1457             for layer_name in adapter_layers:
1458                 nlayers = (
1459 ❱                   1 if layer_name == "lm_head" else len(model.model.model.layers)
1460                 )
1461                 adapter_weights = LoraWeights.prepare_weights(
1462                     config=adapter_config,
```
and an excerpt from the locals:

```
adapter_config = LoraConfig(
    base_model_name_or_path='llava-hf/llava-v1.6-mis…',
    r=16, ...
    target_modules={ ... },
)
...
model = <text_generation_server.models.vlm_causal_lm.Vlm… object at 0x7f316293c4d0>
...
model_id = 'llava-hf/llava-v1.6-mistral-7b-hf'
```
Full traceback:

```
2025-01-20T13:19:22.228541Z  INFO text_generation_launcher: Loading adapter weights into model: test-local-7
2025-01-20T13:19:22.266211Z ERROR text_generation_launcher: Error when initializing model
Traceback (most recent call last):
  File "/opt/conda/bin/text-generation-server", line 10, in <module>
    sys.exit(app())
  File "/opt/conda/lib/python3.11/site-packages/typer/main.py", line 323, in __call__
    return get_command(self)(*args, **kwargs)
  File "/opt/conda/lib/python3.11/site-packages/click/core.py", line 1161, in __call__
    return self.main(*args, **kwargs)
  File "/opt/conda/lib/python3.11/site-packages/typer/core.py", line 743, in main
    return _main(
  File "/opt/conda/lib/python3.11/site-packages/typer/core.py", line 198, in _main
    rv = self.invoke(ctx)
  File "/opt/conda/lib/python3.11/site-packages/click/core.py", line 1697, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/opt/conda/lib/python3.11/site-packages/click/core.py", line 1443, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/opt/conda/lib/python3.11/site-packages/click/core.py", line 788, in invoke
    return __callback(*args, **kwargs)
  File "/opt/conda/lib/python3.11/site-packages/typer/main.py", line 698, in wrapper
    return callback(**use_params)
  File "/usr/src/server/text_generation_server/cli.py", line 119, in serve
    server.serve(
  File "/usr/src/server/text_generation_server/server.py", line 315, in serve
    asyncio.run(
  File "/opt/conda/lib/python3.11/asyncio/runners.py", line 190, in run
    return runner.run(main)
  File "/opt/conda/lib/python3.11/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
  File "/opt/conda/lib/python3.11/asyncio/base_events.py", line 641, in run_until_complete
    self.run_forever()
  File "/opt/conda/lib/python3.11/asyncio/base_events.py", line 608, in run_forever
    self._run_once()
  File "/opt/conda/lib/python3.11/asyncio/base_events.py", line 1936, in _run_once
    handle._run()
  File "/opt/conda/lib/python3.11/asyncio/events.py", line 84, in _run
    self._context.run(self._callback, *self._args)
> File "/usr/src/server/text_generation_server/server.py", line 268, in serve_inner
    model = get_model_with_lora_adapters(
  File "/usr/src/server/text_generation_server/models/__init__.py", line 1459, in get_model_with_lora_adapters
    1 if layer_name == "lm_head" else len(model.model.model.layers)
  File "/opt/conda/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1729, in __getattr__
    raise AttributeError(f"'{type(self).__name__}' object has no attribute '{name}'")
AttributeError: 'LlavaNextForConditionalGeneration' object has no attribute 'model'
```
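For what it's worth, the lookup at `models/__init__.py:1459` seems to assume a text-only stack, where `model.model` is a `*ForCausalLM` whose decoder layers sit at `.model.layers`. With llava-next, `model.model` is the multimodal wrapper, and (if I read transformers' llava-next implementation correctly) the text decoder is nested under `language_model` instead. A pure-Python mock of the two layouts; all classes here are stand-ins I wrote to reproduce the lookup, not the real modules:

```python
# Mock of the nesting involved; the attribute names are an assumption
# based on transformers' llava-next implementation, where the text
# decoder sits under `language_model`, not `model`.

class TextBackbone:
    def __init__(self, n_layers):
        self.layers = list(range(n_layers))

class MistralForCausalLM:
    def __init__(self, n_layers):
        self.model = TextBackbone(n_layers)

class LlavaNextForConditionalGeneration:
    def __init__(self, n_layers):
        self.language_model = MistralForCausalLM(n_layers)

class VlmCausalLM:
    def __init__(self, n_layers):
        self.model = LlavaNextForConditionalGeneration(n_layers)

model = VlmCausalLM(n_layers=32)

# What models/__init__.py:1459 effectively does -- reproduces the error:
try:
    nlayers = len(model.model.model.layers)
except AttributeError as e:
    print(type(e).__name__, e)

# For this layout the decoder layers sit one attribute deeper:
nlayers = len(model.model.language_model.model.layers)
print(nlayers)  # 32
```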