
AttributeError: 'NoneType' object has no attribute '_parameters' during model dispatch in Docker environment

Open hl2dm opened this issue 1 year ago • 2 comments

I am encountering an AttributeError when trying to deploy the nveagle-eagle-x5-13b-chat model using Docker. The error occurs deep within the accelerate library's hook management, specifically when trying to attach or detach hooks for device alignment. Here is the error trace:

Traceback (most recent call last):
  File "/home/user/app/app.py", line 47
    tokenizer, model, image_processor, context_len = load_pretrained_model(...
  ...
  File "/usr/local/lib/python3.10/site-packages/accelerate/hooks.py", line 313, in detach_hook
    set_module_tensor_to_device(module, name, device, value=self.weights_map.get(name, None))
  File "/usr/local/lib/python3.10/site-packages/accelerate/utils/modeling.py", line 261
    if tensor_name not in module._parameters and tensor_name not in module._buffers:
AttributeError: 'NoneType' object has no attribute '_parameters'

The error suggests that a module expected to have parameters and buffers is None at the time of the hook operation. This issue arises specifically when deploying via Docker with the command:

docker run -it -p 7860:7860 --platform=linux/amd64 --gpus all registry.hf.space/nveagle-eagle-x5-13b-chat python app.py
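As a simplified illustration of the failure mode (an assumption about the mechanism, not accelerate's actual code): offloading hooks resolve dotted tensor names by walking submodule attributes, and if the model is only partially initialized when the hook fires, a step of that walk can come back as None before `_parameters` is ever touched:

```python
# Hypothetical sketch, not accelerate's real implementation: walking a dotted
# name like "vision_tower.weight" down to the module that owns "weight".

class Root:
    """Stand-in for a partially initialized model with a missing submodule."""
    pass

def resolve(module, dotted_name):
    """Return the module that should own the final attribute of dotted_name."""
    for part in dotted_name.split(".")[:-1]:
        module = getattr(module, part, None)  # a missing step yields None
    return module

root = Root()
owner = resolve(root, "vision_tower.weight")  # 'vision_tower' was never attached
print(owner)  # None

try:
    owner._parameters  # same access pattern as modeling.py line 261 in the trace
except AttributeError as exc:
    print(exc)  # 'NoneType' object has no attribute '_parameters'
```

If that is what is happening here, the root cause is upstream of the traceback: some submodule was never constructed before dispatch.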

Any insights or suggestions on how to address this issue would be greatly appreciated.

hl2dm avatar Sep 05 '24 07:09 hl2dm

That's an issue with the accelerate and transformers packages. You need to upgrade both; it worked for me:

pip install -U accelerate
pip install -U transformers
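To confirm the upgrade actually took effect inside the container (pip inside Docker can silently target the wrong environment), a quick version check can help. This is a generic helper sketch, not part of the repo:

```python
# Report the installed version of each named package, or note its absence.
import importlib

def report(names):
    """Return 'name version' strings, or a note if the package is missing."""
    lines = []
    for name in names:
        try:
            mod = importlib.import_module(name)
            lines.append(f"{name} {getattr(mod, '__version__', 'unknown')}")
        except ImportError:
            lines.append(f"{name} not installed")
    return lines

# Inside the running container this should show the freshly upgraded versions:
for line in report(["accelerate", "transformers"]):
    print(line)
```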

thebazshah avatar Sep 08 '24 03:09 thebazshah

@thebazshah That works, but now I hit this:

2024-09-10 17:39:10 nveagle-eagle-x5-13b-chat-1 | /usr/local/lib/python3.10/site-packages/transformers/generation/utils.py:1375: UserWarning: You have modified the pretrained model configuration to control generation. This is a deprecated strategy to control generation and will be removed soon, in a future version. Please use and modify the model generation configuration (see https://huggingface.co/docs/transformers/generation_strategies#default-text-generation-configuration )
  warnings.warn(
The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's attention_mask to obtain reliable results.
Exception in thread Thread-7 (generate):
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/usr/local/lib/python3.10/threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/user/app/eagle/model/language_model/eagle_llama.py", line 137, in generate
    return super().generate(
  File "/usr/local/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/transformers/generation/utils.py", line 2024, in generate
    result = self._sample(
  File "/usr/local/lib/python3.10/site-packages/transformers/generation/utils.py", line 2982, in _sample
    outputs = self(**model_inputs, return_dict=True)
  File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/accelerate/hooks.py", line 170, in new_forward
    output = module._old_forward(*args, **kwargs)
TypeError: EagleLlamaForCausalLM.forward() got an unexpected keyword argument 'cache_position'

2024-09-10 17:39:24 nveagle-eagle-x5-13b-chat-1 | Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/gradio/queueing.py", line 536, in process_events
    response = await route_utils.call_process_api(
  File "/usr/local/lib/python3.10/site-packages/gradio/route_utils.py", line 321, in call_process_api
    output = await app.get_blocks().process_api(
  File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1935, in process_api
    result = await self.call_function(
  File "/usr/local/lib/python3.10/site-packages/gradio/blocks.py", line 1532, in call_function
    prediction = await utils.async_iteration(iterator)
  File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 671, in async_iteration
    return await iterator.__anext__()
  File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 664, in __anext__
    return await anyio.to_thread.run_sync(
  File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
  File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2177, in run_sync_in_worker_thread
    return await future
  File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 859, in run
    result = context.run(func, *args)
  File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 647, in run_sync_iterator_async
    return next(iterator)
  File "/usr/local/lib/python3.10/site-packages/gradio/utils.py", line 809, in gen_wrapper
    response = next(iterator)
  File "/home/user/app/app.py", line 153, in generate
    for new_text in streamer:
  File "/usr/local/lib/python3.10/site-packages/transformers/generation/streamers.py", line 223, in __next__
    value = self.text_queue.get(timeout=self.timeout)
  File "/usr/local/lib/python3.10/queue.py", line 179, in get
    raise Empty
_queue.Empty

Unexpected Keyword Argument Error: When attempting to generate text, the system throws a TypeError, indicating an unexpected keyword argument 'cache_position' in the forward call.

I have seen the same error with other language models. I tried changing package versions, but that still doesn't solve it.
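One common cause of this class of error (an assumption here, not confirmed for this repo): a custom `forward()` override written against an older transformers release does not accept the `cache_position` keyword that newer generation code passes through to the model. A minimal sketch of the usual workaround, with hypothetical classes standing in for the actual model:

```python
# Hypothetical classes, not the actual Eagle code: a subclass whose forward()
# predates newer transformers kwargs fails exactly like the traceback above;
# accepting the kwarg (or **kwargs) and ignoring it avoids the TypeError.

class OldStyleModel:
    def forward(self, input_ids=None, attention_mask=None):
        return "ok"

class PatchedModel(OldStyleModel):
    # Accept newer generation kwargs such as cache_position and drop them.
    def forward(self, input_ids=None, attention_mask=None,
                cache_position=None, **kwargs):
        return super().forward(input_ids=input_ids,
                               attention_mask=attention_mask)

# Newer transformers passes cache_position; the old signature raises TypeError:
try:
    OldStyleModel().forward(input_ids=[1], cache_position=[0])
except TypeError as exc:
    print(exc)  # e.g. "...got an unexpected keyword argument 'cache_position'"

print(PatchedModel().forward(input_ids=[1], cache_position=[0]))  # ok
```

If the version mismatch cannot be patched in the model code, the alternative is pinning transformers to a release from before `cache_position` was introduced.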

hl2dm avatar Sep 10 '24 09:09 hl2dm