aphrodite-engine
Initial fetch for `config.json` ignores `--revision`?
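If I set `CMD_ADDITIONAL_ARGUMENTS` to `--model turboderp/Mistral-7B-instruct-exl2 --revision 4.0bpw`, the engine still requests `config.json` from the `main` branch, which does not contain that file, and startup fails with this error: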
2024-03-13T14:03:42.164428603Z + exec python3 -m aphrodite.endpoints.openai.api_server --host 0.0.0.0 --port 5000 --download-dir /app/tmp/hub --max-model-len 4096 --quantization exl2 --enforce-eager --model turboderp/Mistral-7B-instruct-exl2 --revision 4.0bpw --download-dir /volume/hub
2024-03-13T14:03:44.082470629Z WARNING: exl2 quantization is not fully optimized yet. The speed can be slower
2024-03-13T14:03:44.082490019Z than non-quantized models.
2024-03-13T14:03:44.084028269Z INFO: Initializing the Aphrodite Engine (v0.5.0) with the following config:
2024-03-13T14:03:44.084035559Z INFO: Model = 'turboderp/Mistral-7B-instruct-exl2'
2024-03-13T14:03:44.084039269Z INFO: DataType = torch.bfloat16
2024-03-13T14:03:44.084042909Z INFO: Model Load Format = auto
2024-03-13T14:03:44.084045799Z INFO: Number of GPUs = 1
2024-03-13T14:03:44.084048349Z INFO: Disable Custom All-Reduce = False
2024-03-13T14:03:44.084050519Z INFO: Quantization Format = exl2
2024-03-13T14:03:44.084052649Z INFO: Context Length = 4096
2024-03-13T14:03:44.084057519Z INFO: Enforce Eager Mode = True
2024-03-13T14:03:44.084059709Z INFO: KV Cache Data Type = auto
2024-03-13T14:03:44.084061789Z INFO: KV Cache Params Path = None
2024-03-13T14:03:44.084063869Z INFO: Device = cuda
2024-03-13T14:03:44.492961433Z Traceback (most recent call last):
2024-03-13T14:03:44.492985083Z File "/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_errors.py", line 304, in hf_raise_for_status
2024-03-13T14:03:44.492988443Z response.raise_for_status()
2024-03-13T14:03:44.492991203Z File "/usr/local/lib/python3.10/dist-packages/requests/models.py", line 1021, in raise_for_status
2024-03-13T14:03:44.492993893Z raise HTTPError(http_error_msg, response=self)
2024-03-13T14:03:44.492996533Z requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://huggingface.co/turboderp/Mistral-7B-instruct-exl2/resolve/main/config.json
2024-03-13T14:03:44.492999293Z
2024-03-13T14:03:44.493001403Z The above exception was the direct cause of the following exception:
2024-03-13T14:03:44.493003813Z
2024-03-13T14:03:44.493005773Z Traceback (most recent call last):
2024-03-13T14:03:44.493008093Z File "/usr/local/lib/python3.10/dist-packages/transformers/utils/hub.py", line 398, in cached_file
2024-03-13T14:03:44.493010223Z resolved_file = hf_hub_download(
2024-03-13T14:03:44.493012273Z File "/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_validators.py", line 118, in _inner_fn
2024-03-13T14:03:44.493014363Z return fn(*args, **kwargs)
2024-03-13T14:03:44.493016513Z File "/usr/local/lib/python3.10/dist-packages/huggingface_hub/file_download.py", line 1261, in hf_hub_download
2024-03-13T14:03:44.493018643Z metadata = get_hf_file_metadata(
2024-03-13T14:03:44.493020723Z File "/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_validators.py", line 118, in _inner_fn
2024-03-13T14:03:44.493022793Z return fn(*args, **kwargs)
2024-03-13T14:03:44.493024903Z File "/usr/local/lib/python3.10/dist-packages/huggingface_hub/file_download.py", line 1667, in get_hf_file_metadata
2024-03-13T14:03:44.493026983Z r = _request_wrapper(
2024-03-13T14:03:44.493029103Z File "/usr/local/lib/python3.10/dist-packages/huggingface_hub/file_download.py", line 385, in _request_wrapper
2024-03-13T14:03:44.493031173Z response = _request_wrapper(
2024-03-13T14:03:44.493033263Z File "/usr/local/lib/python3.10/dist-packages/huggingface_hub/file_download.py", line 409, in _request_wrapper
2024-03-13T14:03:44.493035313Z hf_raise_for_status(response)
2024-03-13T14:03:44.493041563Z huggingface_hub.utils._errors.EntryNotFoundError: 404 Client Error. (Request ID: Root=1-65f1b240-7d5d7d3b668248e21867e88e;d37da62a-3494-4c58-91fd-28dda5419afb)
2024-03-13T14:03:44.493043843Z
2024-03-13T14:03:44.493045873Z Entry Not Found for url: https://huggingface.co/turboderp/Mistral-7B-instruct-exl2/resolve/main/config.json.
2024-03-13T14:03:44.493062953Z
2024-03-13T14:03:44.493066373Z The above exception was the direct cause of the following exception:
2024-03-13T14:03:44.493068993Z
2024-03-13T14:03:44.493071043Z Traceback (most recent call last):
2024-03-13T14:03:44.493073083Z File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
2024-03-13T14:03:44.493075173Z return _run_code(code, main_globals, None,
2024-03-13T14:03:44.493077243Z File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
2024-03-13T14:03:44.493079313Z exec(code, run_globals)
2024-03-13T14:03:44.493081353Z File "/app/aphrodite-engine/aphrodite/endpoints/openai/api_server.py", line 561, in <module>
2024-03-13T14:03:44.493083673Z engine = AsyncAphrodite.from_engine_args(engine_args)
2024-03-13T14:03:44.493085783Z File "/app/aphrodite-engine/aphrodite/engine/async_aphrodite.py", line 676, in from_engine_args
2024-03-13T14:03:44.493087773Z engine = cls(parallel_config.worker_use_ray,
2024-03-13T14:03:44.493089813Z File "/app/aphrodite-engine/aphrodite/engine/async_aphrodite.py", line 341, in __init__
2024-03-13T14:03:44.493091913Z self.engine = self._init_engine(*args, **kwargs)
2024-03-13T14:03:44.493093943Z File "/app/aphrodite-engine/aphrodite/engine/async_aphrodite.py", line 410, in _init_engine
2024-03-13T14:03:44.493095973Z return engine_class(*args, **kwargs)
2024-03-13T14:03:44.493098053Z File "/app/aphrodite-engine/aphrodite/engine/aphrodite_engine.py", line 102, in __init__
2024-03-13T14:03:44.493100183Z self._init_tokenizer()
2024-03-13T14:03:44.493102283Z File "/app/aphrodite-engine/aphrodite/engine/aphrodite_engine.py", line 166, in _init_tokenizer
2024-03-13T14:03:44.493104343Z self.tokenizer: TokenizerGroup = TokenizerGroup(
2024-03-13T14:03:44.493106503Z File "/app/aphrodite-engine/aphrodite/transformers_utils/tokenizer.py", line 157, in __init__
2024-03-13T14:03:44.493108583Z self.tokenizer = get_tokenizer(self.tokenizer_id, **tokenizer_config)
2024-03-13T14:03:44.493110623Z File "/app/aphrodite-engine/aphrodite/transformers_utils/tokenizer.py", line 87, in get_tokenizer
2024-03-13T14:03:44.493112653Z tokenizer = AutoTokenizer.from_pretrained(
2024-03-13T14:03:44.493114713Z File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/tokenization_auto.py", line 782, in from_pretrained
2024-03-13T14:03:44.493116783Z config = AutoConfig.from_pretrained(
2024-03-13T14:03:44.493118833Z File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/configuration_auto.py", line 1111, in from_pretrained
2024-03-13T14:03:44.493120903Z config_dict, unused_kwargs = PretrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
2024-03-13T14:03:44.493122953Z File "/usr/local/lib/python3.10/dist-packages/transformers/configuration_utils.py", line 633, in get_config_dict
2024-03-13T14:03:44.493125233Z config_dict, kwargs = cls._get_config_dict(pretrained_model_name_or_path, **kwargs)
2024-03-13T14:03:44.493127343Z File "/usr/local/lib/python3.10/dist-packages/transformers/configuration_utils.py", line 688, in _get_config_dict
2024-03-13T14:03:44.493129363Z resolved_config_file = cached_file(
2024-03-13T14:03:44.493131423Z File "/usr/local/lib/python3.10/dist-packages/transformers/utils/hub.py", line 452, in cached_file
2024-03-13T14:03:44.493133483Z raise EnvironmentError(
2024-03-13T14:03:44.493135593Z OSError: turboderp/Mistral-7B-instruct-exl2 does not appear to have a file named config.json. Checkout 'https://huggingface.co/turboderp/Mistral-7B-instruct-exl2/main' for avail
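
To make the mismatch concrete, here is a minimal sketch of the URL the traceback shows next to the one I would expect when `--revision 4.0bpw` is honored. `resolve_url` is a hypothetical helper written only for this illustration, not part of aphrodite-engine or `huggingface_hub`. It just mirrors the `/resolve/<revision>/<filename>` shape visible in the log above.

```python
from urllib.parse import quote


def resolve_url(repo_id: str, filename: str, revision: str = "main") -> str:
    """Hypothetical helper: build a Hub file URL in the shape seen in the traceback."""
    # The revision is percent-encoded so branch names with slashes stay one path segment.
    return (
        f"https://huggingface.co/{repo_id}/resolve/"
        f"{quote(revision, safe='')}/{filename}"
    )


repo = "turboderp/Mistral-7B-instruct-exl2"

# What the log shows being requested: no revision, so it falls back to "main".
default_url = resolve_url(repo, "config.json")

# What I would expect to be requested when --revision 4.0bpw is passed through.
pinned_url = resolve_url(repo, "config.json", revision="4.0bpw")

print(default_url)
print(pinned_url)
```

The last aphrodite frame before the failure is `get_tokenizer` calling `AutoTokenizer.from_pretrained`, which accepts a `revision` keyword. My guess, not confirmed from the source, is that the revision is not reaching that call.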