RESOURCE_EXHAUSTED

Open Allesanddro opened this issue 2 years ago • 5 comments

dalle-backend | --> Starting DALL-E Server. This might take up to two minutes.
Traceback (most recent call last):
dalle-backend |   File "app.py", line 65, in <module>
dalle-backend |     dalle_model = DalleModel(args.model_version)
dalle-backend |   File "/app/dalle_model.py", line 70, in __init__
dalle-backend |     self.params = replicate(params)
dalle-backend |   File "/usr/local/lib/python3.8/dist-packages/flax/jax_utils.py", line 56, in replicate
dalle-backend |     return jax.device_put_replicated(tree, devices)
dalle-backend |   File "/usr/local/lib/python3.8/dist-packages/jax/_src/api.py", line 2940, in device_put_replicated
dalle-backend |     return tree_map(_device_put_replicated, x)
dalle-backend |   File "/usr/local/lib/python3.8/dist-packages/jax/_src/tree_util.py", line 200, in tree_map
dalle-backend |     return treedef.unflatten(f(*xs) for xs in zip(*all_leaves))
dalle-backend |   File "/usr/local/lib/python3.8/dist-packages/jax/_src/tree_util.py", line 200, in <genexpr>
dalle-backend |     return treedef.unflatten(f(*xs) for xs in zip(*all_leaves))
dalle-backend |   File "/usr/local/lib/python3.8/dist-packages/jax/_src/api.py", line 2928, in _device_put_replicated
dalle-backend |     buf, = dispatch.device_put(x, devices[0])
dalle-backend |   File "/usr/local/lib/python3.8/dist-packages/jax/_src/dispatch.py", line 1202, in device_put
dalle-backend |     return device_put_handlers[type(x)](x, device)
dalle-backend |   File "/usr/local/lib/python3.8/dist-packages/jax/_src/dispatch.py", line 1234, in _device_put_device_array
dalle-backend |     x = _copy_device_array_to_device(x, device)
dalle-backend |   File "/usr/local/lib/python3.8/dist-packages/jax/_src/dispatch.py", line 1261, in _copy_device_array_to_device
dalle-backend |     moved_buf = backend.buffer_from_pyval(np.asarray(x.device_buffer), device)
dalle-backend | jaxlib.xla_extension.XlaRuntimeError: RESOURCE_EXHAUSTED: Failed to allocate request for 392.75MiB (411828224B) on device ordinal 0

I use 2x 1060 (the 6GB version), so I don't understand why it fails to allocate 400MB of memory.

Allesanddro • Oct 11 '22 05:10
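
A note on reading the error: the traceback shows `replicate` walking the parameter tree with `tree_map` and copying one array at a time, so the 392.75MiB figure is the size of the single array that no longer fit, not the model's total footprint. `replicate` also places a full copy of the parameters on every device, so two 6GB cards do not behave like one 12GB card. The sketch below only checks the arithmetic in the message; `parse_failed_request` is a hypothetical helper written for illustration, not part of JAX or dalle-playground.

```python
import re

MIB = 1024 * 1024  # bytes per mebibyte


def parse_failed_request(message):
    """Return (reported_mib, size_bytes) from a RESOURCE_EXHAUSTED message.

    Hypothetical helper: it assumes the message format seen in this thread.
    """
    match = re.search(r"request for ([\d.]+)MiB \((\d+)B\)", message)
    if match is None:
        raise ValueError("not a recognised allocation failure message")
    return float(match.group(1)), int(match.group(2))


msg = (
    "RESOURCE_EXHAUSTED: Failed to allocate request for "
    "392.75MiB (411828224B) on device ordinal 0"
)
reported_mib, size_bytes = parse_failed_request(msg)

# The byte count and the MiB figure describe the same single buffer.
print(size_bytes / MIB)  # 392.75
```

The small size of the failing request therefore says little by itself: it would be the allocation that pushed the device over its limit, after earlier arrays had already filled most of the memory.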

I got the same error for 380 MiB, but I just have a single 2080 Super.

VeXHarbinger • Oct 13 '22 07:10

There is a temporary workaround: recreate the image, and it will then work for a few hours to a few days.

Allesanddro • Oct 13 '22 08:10

> There is a temporary workaround which is recreating the image then it will work for a few hours-days

Although I suppose I'm too new to Docker to know what you mean. I was so happy I got this deep.

VeXHarbinger • Oct 19 '22 03:10

Stop the containers, delete them, delete the images that got built, then start again:

docker-compose down
docker rm dalle-playground_dalle-backend
docker rm dalle-playground_dalle-interface
docker rmi dalle-playground_dalle-backend:latest
docker rmi dalle-playground_dalle-interface:latest
# ONLY RUN THIS IF YOU HAVE NO OTHER CONTAINERS THAT ARE STOPPED
# Remove all unused containers, networks, images (both dangling and unreferenced), and optionally, volumes.
docker system prune
docker-compose up -d

Allesanddro • Oct 19 '22 13:10

TYVM for the steps, I'll try them out this evening.

Tried them with no success: RESOURCE_EXHAUSTED: Failed to allocate request for 384.00MiB (402653184B) on device ordinal 0

VeXHarbinger • Oct 19 '22 14:10
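
Since recreating the image did not help here, JAX's documented GPU memory settings may be worth trying. By default JAX preallocates a large share of GPU memory at startup. Setting `XLA_PYTHON_CLIENT_PREALLOCATE=false` makes it allocate on demand instead, and `XLA_PYTHON_CLIENT_MEM_FRACTION` changes the preallocated share. The following is a minimal sketch, assuming the variables are set before `jax` is imported (in this setup, presumably through the backend container's environment). `configure_jax_gpu_memory` is a hypothetical helper, not something dalle-playground provides.

```python
import os


def configure_jax_gpu_memory(preallocate=False, mem_fraction=None):
    """Set JAX's GPU memory environment variables.

    Hypothetical helper; it must run before `import jax` to have any effect.
    """
    os.environ["XLA_PYTHON_CLIENT_PREALLOCATE"] = "true" if preallocate else "false"
    if mem_fraction is not None:
        if not 0.0 < mem_fraction <= 1.0:
            raise ValueError("mem_fraction must be in the interval (0, 1]")
        os.environ["XLA_PYTHON_CLIENT_MEM_FRACTION"] = f"{mem_fraction:.2f}"
    # Return the resulting settings so they can be logged or inspected.
    return {
        key: value
        for key, value in os.environ.items()
        if key.startswith("XLA_PYTHON_CLIENT_")
    }


settings = configure_jax_gpu_memory(preallocate=False)
print(settings["XLA_PYTHON_CLIENT_PREALLOCATE"])  # false

# Only after this point would the application import jax and load the model.
```

These settings change how memory is reserved, not how much the model needs. If the replicated parameters are larger than a single card's memory, the allocation will still fail, and a smaller model version would be the thing to try.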