transformers
Should update accelerate minimum version requirement to 0.15
System Info
Using Huggingface Inference Endpoints deployment
Contents of requirements.txt:
accelerate==0.13.2
bitsandbytes
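Until the package metadata is fixed, one workaround is to raise the pin in requirements.txt yourself (a suggested change, not an official requirement; verify the exact version against your deployment):

```
accelerate>=0.15.0
bitsandbytes
```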
Who can help?
@sgugger
Information
- [ ] The official example scripts
- [X] My own modified scripts
Tasks
- [ ] An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- [ ] My own task or dataset (give details below)
Reproduction
Deploy a Hugging Face Inference Endpoint with this as the `__init__` method of handler.py:
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

class EndpointHandler():
    def __init__(self, path: str = ""):
        self.tokenizer = AutoTokenizer.from_pretrained(path)
        self.model = AutoModelForSeq2SeqLM.from_pretrained(path, device_map="auto", load_in_8bit=True)
with accelerate < 0.15 installed. This leads to the error below:
Traceback (most recent call last):
  File "/opt/conda/lib/python3.9/site-packages/starlette/routing.py", line 671, in lifespan
    async with self.lifespan_context(app):
  File "/opt/conda/lib/python3.9/site-packages/starlette/routing.py", line 566, in __aenter__
    await self._router.startup()
  File "/opt/conda/lib/python3.9/site-packages/starlette/routing.py", line 648, in startup
    await handler()
  File "/app/./webservice_starlette.py", line 56, in some_startup_task
    inference_handler = get_inference_handler_either_custom_or_default_handler(HF_MODEL_DIR, task=HF_TASK)
  File "/app/./huggingface_inference_toolkit/handler.py", line 44, in get_inference_handler_either_custom_or_default_handler
    custom_pipeline = check_and_register_custom_pipeline_from_directory(model_dir)
  File "/app/./huggingface_inference_toolkit/utils.py", line 211, in check_and_register_custom_pipeline_from_directory
    custom_pipeline = handler.EndpointHandler(model_dir)
  File "/repository/handler.py", line 12, in __init__
  File "/opt/conda/lib/python3.9/site-packages/transformers/models/auto/auto_factory.py", line 463, in from_pretrained
    return model_class.from_pretrained(
  File "/opt/conda/lib/python3.9/site-packages/transformers/modeling_utils.py", line 2406, in from_pretrained
    dispatch_model(model, device_map=device_map, offload_dir=offload_folder, offload_index=offload_index)
TypeError: dispatch_model() got an unexpected keyword argument 'offload_index'
Application startup failed. Exiting.
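A defensive check in the handler could surface this mismatch earlier with a readable message instead of the opaque TypeError. A minimal sketch, assuming plain release version strings; `parse_version` and `check_accelerate_minimum` are hypothetical helpers, not part of transformers or the inference toolkit:

```python
import importlib.metadata


def parse_version(version: str) -> tuple:
    # Turn "0.13.2" into (0, 13, 2); good enough for plain release versions.
    return tuple(int(part) for part in version.split(".")[:3])


def check_accelerate_minimum(minimum: str = "0.15.0") -> None:
    # Fail fast with an actionable error instead of the TypeError above.
    installed = importlib.metadata.version("accelerate")
    if parse_version(installed) < parse_version(minimum):
        raise RuntimeError(
            f"accelerate {installed} is too old for device_map/load_in_8bit "
            f"offloading; pin accelerate>={minimum} in requirements.txt"
        )
```

Calling `check_accelerate_minimum()` at the top of `__init__` would turn the startup failure into a one-line dependency error in the endpoint logs.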
Expected behavior
The minimum required accelerate version should be raised to 0.15, since these PRs added parameters that did not exist before that version:
- https://github.com/huggingface/transformers/pull/20321
- https://github.com/huggingface/accelerate/pull/873
The code handles both versions. Without having the real traceback we can't know what went wrong on our side.
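Supporting both versions typically means feature-detecting the keyword before passing it. A minimal sketch of that pattern; `call_dispatch_compat` and the two stand-in functions are illustrative only, not actual transformers or accelerate code:

```python
import inspect


def call_dispatch_compat(dispatch_model, model, device_map=None,
                         offload_dir=None, offload_index=None):
    # Only forward `offload_index` when the installed dispatch_model
    # actually declares that parameter (i.e. accelerate >= 0.15).
    kwargs = {"device_map": device_map, "offload_dir": offload_dir}
    if "offload_index" in inspect.signature(dispatch_model).parameters:
        kwargs["offload_index"] = offload_index
    return dispatch_model(model, **kwargs)


# Stand-in for accelerate < 0.15: no `offload_index` parameter.
def old_dispatch_model(model, device_map=None, offload_dir=None):
    return "old"


# Stand-in for accelerate >= 0.15: accepts `offload_index`.
def new_dispatch_model(model, device_map=None, offload_dir=None,
                       offload_index=None):
    return ("new", offload_index)
```

With this pattern the same caller works against either stand-in, which is presumably why the real traceback matters for locating where the unconditional keyword was passed.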
Unfortunately, this traceback is as granular as the HF Inference Endpoints logs get.
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.