
400 Client Error, DALL-E Server fails to start in WSL2

Open Zaithe opened this issue 2 years ago • 2 comments

When I launch the server with the mini, mega, or mega_full model, I get this error:

Traceback (most recent call last):
  File "app.py", line 65, in <module>
    dalle_model = DalleModel(args.model_version)
  File "/home/rhodeder/dalle-playground/backend/dalle_model.py", line 61, in __init__
    self.model, params = DalleBart.from_pretrained(
  File "/home/rhodeder/.local/lib/python3.8/site-packages/dalle_mini/model/utils.py", line 23, in from_pretrained
    pretrained_model_name_or_path = artifact.download(tmp_dir)
  File "/home/rhodeder/.local/lib/python3.8/site-packages/wandb/apis/public.py", line 3867, in download
    manifest = self._load_manifest()
  File "/home/rhodeder/.local/lib/python3.8/site-packages/wandb/apis/public.py", line 4136, in _load_manifest
    req.raise_for_status()
  File "/usr/lib/python3/dist-packages/requests/models.py", line 940, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 400 Client Error: Bad Request for url: https://storage.googleapis.com/wandb-production.appspot.com/dalle-mini/dalle-mini/1cbgmejm/artifact/122141095/wandb_manifest.json?Expires=1655878238&GoogleAccessId=wandb-production%40appspot.gserviceaccount.com&Signature=llx2KDqmrwlnQCYXoWJPtuBeriNuB%2BxtVu7Kryj6imJpq2nTO4CxN5Z%2FxCFbr1EckbiGgfPN1pwFRfKRl87gSjX7TE8XzjswFaElEkFBz0HLWNkhB9R3PndR2oo7d2gBxIRitz68wHAO%2B1e8F%2F9tQs5V5b6pmWzTIWvI5fYfnOBlFyS9TpAaOrMmmluRWTo3xS82n0lGr6lgtKhTUknaXYzIUW9F3hrdSGe7NC%2BimKsmM7tcxmBZLdFqclvtkFAksjF1tMQbbrSyLPkW6G1NWCj%2Fo3VjYT9UbFIQvlD2jTJ7k2lAj4OPYqI5eKD9hHd0TVbowcptS2ddha%2BMa7QYTQ%3D%3D

Any idea what's going on?

Zaithe, Jun 22 '22 06:06

I had this issue initially. My ISP doesn't handle IPv6 properly, so some requests would first try to reach the server over an IPv6 address, take a long time to time out, and then fall back to IPv4 and complete successfully.

Unfortunately, Weights & Biases (wandb) puts an expiration on its download URLs, so if you can't resolve and connect to them quickly enough, you get this error.

I'd check your network and see if there's anything that might be making these requests take an abnormal amount of time to complete.
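
To see how tight the window is, you can read the expiry straight out of the URL in the traceback. This is only a rough sketch: it assumes the `Expires` query parameter is a Unix timestamp in seconds, and the helper names `signed_url_expiry` and `seconds_until_expiry` are made up for illustration, not part of wandb or dalle-mini.

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs, urlsplit


def signed_url_expiry(url):
    """Return the time encoded in the URL's `Expires` parameter, as UTC.

    Assumption: `Expires` holds a Unix timestamp in seconds.
    """
    query = parse_qs(urlsplit(url).query)
    return datetime.fromtimestamp(int(query["Expires"][0]), tz=timezone.utc)


def seconds_until_expiry(url, now=None):
    """Seconds left before the URL expires (negative if already expired)."""
    if now is None:
        now = datetime.now(timezone.utc)
    return (signed_url_expiry(url) - now).total_seconds()


# URL from the traceback; the GoogleAccessId and Signature parameters are
# left out here because only `Expires` matters for this check.
url = (
    "https://storage.googleapis.com/wandb-production.appspot.com/dalle-mini/"
    "dalle-mini/1cbgmejm/artifact/122141095/wandb_manifest.json"
    "?Expires=1655878238"
)

print(signed_url_expiry(url).isoformat())  # 2022-06-22T06:10:38+00:00
```

If you compare that timestamp with the time your request actually goes out (UTC, and only meaningful if your system clock is right), a request sent after it would fit the slow-network explanation. It doesn't prove that's the cause, but it's a cheap thing to rule out.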

trekkie1701c, Jun 22 '22 11:06

I managed to get past that by using a VPN, but now it throws this error. I'm not really sure what to do.

python3 app.py --port 8080 --model_version mini
--> Starting DALL-E Server. This might take up to two minutes.
2022-06-22 16:14:45.696237: W external/org_tensorflow/tensorflow/stream_executor/gpu/asm_compiler.cc:111] *** WARNING *** You are using ptxas 11.0.221, which is older than 11.1. ptxas before 11.1 is known to miscompile XLA code, leading to incorrect results or invalid-address errors.

You may not need to update to CUDA 11.1; cherry-picking the ptxas binary is often sufficient.
2022-06-22 16:14:45.697404: W external/org_tensorflow/tensorflow/stream_executor/gpu/asm_compiler.cc:230] Falling back to the CUDA driver for PTX compilation; ptxas does not support CC 8.6
2022-06-22 16:14:45.697450: W external/org_tensorflow/tensorflow/stream_executor/gpu/asm_compiler.cc:233] Used ptxas at /usr/local/cuda/bin/ptxas
2022-06-22 16:14:45.697522: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:287] Couldn't read CUDA driver version.
2022-06-22 16:14:45.701155: E external/org_tensorflow/tensorflow/stream_executor/cuda/cuda_driver.cc:632] failed to get PTX kernel "shift_right_logical_3" from module: CUDA_ERROR_NOT_FOUND: named symbol not found
2022-06-22 16:14:45.701217: E external/org_tensorflow/tensorflow/compiler/xla/pjrt/pjrt_stream_executor_client.cc:2141] Execution of replica 0 failed: INTERNAL: Could not find the corresponding function
Traceback (most recent call last):
  File "app.py", line 65, in <module>
    dalle_model = DalleModel(args.model_version)
  File "/home/rhodeder/dalle-playground/backend/dalle_model.py", line 61, in __init__
    self.model, params = DalleBart.from_pretrained(
  File "/home/rhodeder/.local/lib/python3.8/site-packages/dalle_mini/model/utils.py", line 25, in from_pretrained
    return super(PretrainedFromWandbMixin, cls).from_pretrained(
  File "/home/rhodeder/.local/lib/python3.8/site-packages/transformers/modeling_flax_utils.py", line 596, in from_pretrained
    model = cls(config, *model_args, _do_init=_do_init, **model_kwargs)
  File "/home/rhodeder/.local/lib/python3.8/site-packages/transformers/models/bart/modeling_flax_bart.py", line 920, in __init__
    super().__init__(config, module, input_shape=input_shape, seed=seed, dtype=dtype, _do_init=_do_init)
  File "/home/rhodeder/.local/lib/python3.8/site-packages/transformers/modeling_flax_utils.py", line 115, in __init__
    self.key = PRNGKey(seed)
  File "/home/rhodeder/.local/lib/python3.8/site-packages/jax/_src/random.py", line 125, in PRNGKey
    key = prng.seed_with_impl(impl, seed)
  File "/home/rhodeder/.local/lib/python3.8/site-packages/jax/_src/prng.py", line 236, in seed_with_impl
    return PRNGKeyArray(impl, impl.seed(seed))
  File "/home/rhodeder/.local/lib/python3.8/site-packages/jax/_src/prng.py", line 276, in threefry_seed
    lax.shift_right_logical(seed_arr, lax_internal._const(seed_arr, 32)))
  File "/home/rhodeder/.local/lib/python3.8/site-packages/jax/_src/lax/lax.py", line 444, in shift_right_logical
    return shift_right_logical_p.bind(x, y)
  File "/home/rhodeder/.local/lib/python3.8/site-packages/jax/core.py", line 323, in bind
    return self.bind_with_trace(find_top_trace(args), args, params)
  File "/home/rhodeder/.local/lib/python3.8/site-packages/jax/core.py", line 326, in bind_with_trace
    out = trace.process_primitive(self, map(trace.full_raise, args), params)
  File "/home/rhodeder/.local/lib/python3.8/site-packages/jax/core.py", line 675, in process_primitive
    return primitive.impl(*tracers, **params)
  File "/home/rhodeder/.local/lib/python3.8/site-packages/jax/_src/dispatch.py", line 100, in apply_primitive
    return compiled_fun(*args)
  File "/home/rhodeder/.local/lib/python3.8/site-packages/jax/_src/dispatch.py", line 151, in <lambda>
    return lambda *args, **kw: compiled(*args, **kw)[0]
  File "/home/rhodeder/.local/lib/python3.8/site-packages/jax/_src/dispatch.py", line 615, in _execute_compiled
    out_bufs_flat = compiled.execute(input_bufs_flat)
jaxlib.xla_extension.XlaRuntimeError: INTERNAL: Could not find the corresponding function

Zaithe, Jun 22 '22 21:06