ray-llm
Error: Tokenizer class does not exist when loading a local model
When I try to load a local model, the following error is raised: ValueError: Tokenizer class BaichuanTokenizer does not exist or is not currently imported.
I have set trust_remote_code=True, and I have used vLLM directly with this model before; it works well.
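For reference, this is roughly how I run the same checkpoint with plain vLLM (model path as in my config below) — a minimal sketch, not my exact script:

```python
from vllm import LLM, SamplingParams

# Plain vLLM loads this checkpoint fine when trust_remote_code is set.
llm = LLM(
    model="/opt/models/myapp-baichuan2-13b-chat-2/",
    trust_remote_code=True,
)
outputs = llm.generate(
    ["### Instruction:\nSay hi.\n### Response:\n"],
    SamplingParams(max_tokens=64),
)
print(outputs[0].outputs[0].text)
```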
(ServeController pid=67769) Unhandled error (suppress with 'RAY_IGNORE_UNHANDLED_ERRORS=1'): ray::ServeReplica:ray-llm-myapp-baichuan:VLLMDeployment:pai--myapp-baichuan2-13b-chat-2.initialize_and_get_metadata() (pid=72307, ip=172.17.0.2, actor_id=438f9032f1ec94c824d6519d01000000, repr=<ray.serve._private.replica.ServeReplica:ray-llm-myapp-baichuan:VLLMDeployment:pai--myapp-baichuan2-13b-chat-2 object at 0x7f4e97177c70>)
(ServeController pid=67769)   File "/home/ray/anaconda3/lib/python3.9/concurrent/futures/_base.py", line 439, in result
(ServeController pid=67769)     return self.__get_result()
(ServeController pid=67769)   File "/home/ray/anaconda3/lib/python3.9/concurrent/futures/_base.py", line 391, in __get_result
(ServeController pid=67769)     raise self._exception
(ServeController pid=67769)   File "/home/ray/anaconda3/lib/python3.9/site-packages/ray/serve/_private/replica.py", line 442, in initialize_and_get_metadata
(ServeController pid=67769)     raise RuntimeError(traceback.format_exc()) from None
(ServeController pid=67769) RuntimeError: Traceback (most recent call last):
(ServeController pid=67769)   File "/home/ray/anaconda3/lib/python3.9/site-packages/ray/serve/_private/replica.py", line 430, in initialize_and_get_metadata
(ServeController pid=67769)     await self._initialize_replica()
(ServeController pid=67769)   File "/home/ray/anaconda3/lib/python3.9/site-packages/ray/serve/_private/replica.py", line 190, in initialize_replica
(ServeController pid=67769)     await sync_to_async(_callable.__init__)(*init_args, **init_kwargs)
(ServeController pid=67769)   File "/home/ray/anaconda3/lib/python3.9/site-packages/rayllm/backend/server/vllm/vllm_deployment.py", line 37, in __init__
(ServeController pid=67769)     await self.engine.start()
(ServeController pid=67769)   File "/home/ray/anaconda3/lib/python3.9/site-packages/rayllm/backend/llm/vllm/vllm_engine.py", line 78, in start
(ServeController pid=67769)     pg, runtime_env = await self.node_initializer.initialize_node(self.llm_app)
(ServeController pid=67769)   File "/home/ray/anaconda3/lib/python3.9/site-packages/rayllm/backend/llm/vllm/vllm_node_initializer.py", line 52, in initialize_node
(ServeController pid=67769)     await self._initialize_local_node(engine_config)
(ServeController pid=67769)   File "/home/ray/anaconda3/lib/python3.9/concurrent/futures/thread.py", line 58, in run
(ServeController pid=67769)     result = self.fn(*self.args, **self.kwargs)
(ServeController pid=67769)   File "/home/ray/anaconda3/lib/python3.9/site-packages/rayllm/backend/llm/vllm/vllm_node_initializer.py", line 72, in _initialize_local_node
(ServeController pid=67769)     _ = AutoTokenizer.from_pretrained(engine_config.actual_hf_model_id)
(ServeController pid=67769)   File "/home/ray/anaconda3/lib/python3.9/site-packages/transformers/models/auto/tokenization_auto.py", line 748, in from_pretrained
(ServeController pid=67769)     raise ValueError(
(ServeController pid=67769) ValueError: Tokenizer class BaichuanTokenizer does not exist or is not currently imported.
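Note the failing frame: vllm_node_initializer.py appears to call AutoTokenizer.from_pretrained(engine_config.actual_hf_model_id) without forwarding trust_remote_code. The same error can be reproduced outside Ray Serve (model path taken from my config):

```python
from transformers import AutoTokenizer

# Same call as rayllm's node initializer: no trust_remote_code, so the
# checkpoint's custom BaichuanTokenizer class is never imported.
tok = AutoTokenizer.from_pretrained("/opt/models/myapp-baichuan2-13b-chat-2/")
# ValueError: Tokenizer class BaichuanTokenizer does not exist or is not currently imported.

# Passing trust_remote_code=True loads the tokenizer without complaint:
tok = AutoTokenizer.from_pretrained(
    "/opt/models/myapp-baichuan2-13b-chat-2/",
    trust_remote_code=True,
)
```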
Model YAML:

enabled: true
deployment_config:
  autoscaling_config:
    min_replicas: 1
    initial_replicas: 1
    max_replicas: 2
    target_num_ongoing_requests_per_replica: 1.0
    metrics_interval_s: 10.0
    look_back_period_s: 30.0
    smoothing_factor: 1.0
    downscale_delay_s: 300.0
    upscale_delay_s: 90.0
  ray_actor_options:
    num_cpus: 4
engine_config:
  model_id: pai/myapp-baichuan2-13b-chat-2
  hf_model_id: /opt/models/myapp-baichuan2-13b-chat-2/
  engine_kwargs:
    trust_remote_code: true
  runtime_env:
    env_vars:
      YOUR_ENV_VAR: "your_value"
  generation:
    prompt_format:  # see the rendering sketch after this config
      system: "{instruction}\n"  # System message. Will default to default_system_message
      assistant: "### Response:\n{instruction}\n"  # Past assistant message. Used in chat completions API.
      trailing_assistant: "### Response:\n"  # New assistant message. After this point, model will generate tokens.
      user: "### Instruction:\n{instruction}\n"  # User message.
      default_system_message: "Below is an instruction that describes a task. Write a response that appropriately completes the request."  # Default system message.
      system_in_user: false  # Whether the system prompt is inside the user prompt. If true, the user field should include '{system}'
      add_system_tags_even_if_message_is_empty: false  # Whether to include the system tags even if the user message is empty.
      strip_whitespace: false  # Whether to automatically strip whitespace from left and right of user-supplied messages for chat completions
    stopping_sequences: ["### Response:", "### End"]
scaling_config:
  num_workers: 1
  num_gpus_per_worker: 1
  num_cpus_per_worker: 4
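Since the prompt_format fields above define how chat messages are flattened into a single prompt, here is a small sketch of how I understand them to combine (a hypothetical render function for illustration; not rayllm's actual code):

```python
# Hypothetical illustration of the prompt_format above; not rayllm's implementation.
PROMPT_FORMAT = {
    "system": "{instruction}\n",
    "assistant": "### Response:\n{instruction}\n",
    "trailing_assistant": "### Response:\n",
    "user": "### Instruction:\n{instruction}\n",
    "default_system_message": (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request."
    ),
}

def render(messages):
    # System message falls back to default_system_message (system_in_user is false).
    system = next(
        (m["content"] for m in messages if m["role"] == "system"),
        PROMPT_FORMAT["default_system_message"],
    )
    parts = [PROMPT_FORMAT["system"].format(instruction=system)]
    for m in messages:
        if m["role"] in ("user", "assistant"):
            parts.append(PROMPT_FORMAT[m["role"]].format(instruction=m["content"]))
    # The model generates after trailing_assistant; generation halts at
    # stopping_sequences such as "### Response:" or "### End".
    parts.append(PROMPT_FORMAT["trailing_assistant"])
    return "".join(parts)

print(render([{"role": "user", "content": "Say hi."}]))
```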
Serve config:

applications:
- name: ray-llm-myapp-baichuan
  route_prefix: /
  import_path: rayllm.backend:router_application
  args:
    models:
      - "/data/ray-llm/serve_configs/baichuan2-13b-chat.yaml"
Update transformers and try again:

pip install transformers -U
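It may also be worth verifying that the upgrade took effect in the environment Ray actually runs in; a minimal check, assuming the model path from the config above:

```python
import transformers
from transformers import AutoTokenizer

# Confirm the interpreter Ray uses sees the upgraded transformers version.
print(transformers.__version__)

# The tokenizer should resolve when remote code is trusted.
tok = AutoTokenizer.from_pretrained(
    "/opt/models/myapp-baichuan2-13b-chat-2/",
    trust_remote_code=True,
)
print(type(tok).__name__)  # expect BaichuanTokenizer
```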