
Is bert-base-uncased supported?

nick1115 opened this issue 2 years ago · 1 comment

Hi, I'm trying to deploy the bert-base-uncased model with v0.5.0, but got an error: ValueError: BertLMHeadModel does not support device_map='auto' yet.

root@nick-test1-8zjwg-135105-worker-0:/usr/local/bin# ./text-generation-launcher --model-id bert-base-uncased
2023-04-14T07:24:23.167920Z  INFO text_generation_launcher: Args { model_id: "bert-base-uncased", revision: None, sharded: None, num_shard: Some(1), quantize: false, max_concurrent_requests: 128, max_best_of: 2, max_stop_sequences: 4, max_input_length: 1000, max_total_tokens: 1512, max_batch_size: 32, max_waiting_tokens: 20, port: 80, shard_uds_path: "/tmp/text-generation-server", master_addr: "localhost", master_port: 29500, huggingface_hub_cache: Some("/data"), weights_cache_override: None, disable_custom_kernels: false, json_output: false, otlp_endpoint: None, cors_allow_origin: [], watermark_gamma: None, watermark_delta: None }
2023-04-14T07:24:23.168401Z  INFO text_generation_launcher: Starting shard 0
2023-04-14T07:24:29.874262Z ERROR shard-manager: text_generation_launcher: "Error when initializing model
Traceback (most recent call last):
  File \"/opt/miniconda/envs/text-generation/bin/text-generation-server\", line 8, in <module>
    sys.exit(app())
  File \"/opt/miniconda/envs/text-generation/lib/python3.9/site-packages/typer/main.py\", line 311, in __call__
    return get_command(self)(*args, **kwargs)
  File \"/opt/miniconda/envs/text-generation/lib/python3.9/site-packages/click/core.py\", line 1130, in __call__
    return self.main(*args, **kwargs)
  File \"/opt/miniconda/envs/text-generation/lib/python3.9/site-packages/typer/core.py\", line 778, in main
    return _main(
  File \"/opt/miniconda/envs/text-generation/lib/python3.9/site-packages/typer/core.py\", line 216, in _main
    rv = self.invoke(ctx)
  File \"/opt/miniconda/envs/text-generation/lib/python3.9/site-packages/click/core.py\", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File \"/opt/miniconda/envs/text-generation/lib/python3.9/site-packages/click/core.py\", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File \"/opt/miniconda/envs/text-generation/lib/python3.9/site-packages/click/core.py\", line 760, in invoke
    return __callback(*args, **kwargs)
  File \"/opt/miniconda/envs/text-generation/lib/python3.9/site-packages/typer/main.py\", line 683, in wrapper
    return callback(**use_params)  # type: ignore
  File \"/opt/miniconda/envs/text-generation/lib/python3.9/site-packages/text_generation_server/cli.py\", line 55, in serve
    server.serve(model_id, revision, sharded, quantize, uds_path)
  File \"/opt/miniconda/envs/text-generation/lib/python3.9/site-packages/text_generation_server/server.py\", line 135, in serve
    asyncio.run(serve_inner(model_id, revision, sharded, quantize))
  File \"/opt/miniconda/envs/text-generation/lib/python3.9/asyncio/runners.py\", line 44, in run
    return loop.run_until_complete(main)
  File \"/opt/miniconda/envs/text-generation/lib/python3.9/asyncio/base_events.py\", line 634, in run_until_complete
    self.run_forever()
  File \"/opt/miniconda/envs/text-generation/lib/python3.9/asyncio/base_events.py\", line 601, in run_forever
    self._run_once()
  File \"/opt/miniconda/envs/text-generation/lib/python3.9/asyncio/base_events.py\", line 1905, in _run_once
    handle._run()
  File \"/opt/miniconda/envs/text-generation/lib/python3.9/asyncio/events.py\", line 80, in _run
    self._context.run(self._callback, *self._args)
> File \"/opt/miniconda/envs/text-generation/lib/python3.9/site-packages/text_generation_server/server.py\", line 104, in serve_inner
    model = get_model(model_id, revision, sharded, quantize)
  File \"/opt/miniconda/envs/text-generation/lib/python3.9/site-packages/text_generation_server/models/__init__.py\", line 130, in get_model
    return CausalLM(model_id, revision, quantize=quantize)
  File \"/opt/miniconda/envs/text-generation/lib/python3.9/site-packages/text_generation_server/models/causal_lm.py\", line 308, in __init__
    self.model = AutoModelForCausalLM.from_pretrained(
  File \"/opt/miniconda/envs/text-generation/lib/python3.9/site-packages/transformers-4.28.0.dev0-py3.9-linux-x86_64.egg/transformers/models/auto/auto_factory.py\", line 471, in from_pretrained
    return model_class.from_pretrained(
  File \"/opt/miniconda/envs/text-generation/lib/python3.9/site-packages/transformers-4.28.0.dev0-py3.9-linux-x86_64.egg/transformers/modeling_utils.py\", line 2644, in from_pretrained
    raise ValueError(f\"{model.__class__.__name__} does not support `device_map='{device_map}'` yet.\")
ValueError: BertLMHeadModel does not support `device_map='auto'` yet.
" rank=0
2023-04-14T07:24:30.475420Z ERROR text_generation_launcher: Shard 0 failed to start.
2023-04-14T07:24:30.475495Z  INFO text_generation_launcher: Shutting down shards

nick1115 — Apr 14 '23

I don't think that model is on the supported list, but the error was resolved by setting the CUDA_VISIBLE_DEVICES environment variable (https://github.com/huggingface/text-generation-inference/issues/337#issuecomment-1551571575):

CUDA_VISIBLE_DEVICES=0 text-generation-launcher ...
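For context (an inference from the linked issue, not something confirmed in this thread): the launcher appears to request device_map="auto" only when multiple GPUs are visible, so restricting the process to a single GPU sidesteps the sharding path that BertLMHeadModel does not support. Note that the variable must be set before any CUDA runtime initializes in the process; a minimal sketch:

```python
import os

# CUDA_VISIBLE_DEVICES must be set before torch (or any CUDA runtime)
# initializes in this process; changing it afterwards has no effect.
os.environ["CUDA_VISIBLE_DEVICES"] = "0"  # expose only GPU index 0

# Any framework imported after this point sees exactly one device,
# so multi-GPU placement logic like device_map="auto" is never triggered.
print(os.environ["CUDA_VISIBLE_DEVICES"])
```

In practice it is simpler to export the variable in the shell before launching, as in the command above, rather than setting it from Python.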

seonglae — Nov 15 '23