dreambooth
Output Docker container fails to run predictions
I'm running my private Replicate Docker image on my own Google Cloud VM instance after running a Dreambooth training.
Unfortunately, this command fails:
curl http://localhost:5000/predictions -X POST -H "Content-Type: application/json" -d '{"input":{"prompt":"a photo of zwx man", "width": 512, "height": 512}}'
It quickly fails with just "Internal Server Error". However, when looking at the Docker container logs, it seems like a prediction actually ran.
Using seed: 12294
using txt2img
INFO: 172.17.0.1:57596 - "POST /predictions HTTP/1.1" 500 Internal Server Error
ERROR: Exception in ASGI application
Traceback (most recent call last):
File "/root/.pyenv/versions/3.10.9/lib/python3.10/site-packages/uvicorn/protocols/http/httptools_impl.py", line 419, in run_asgi
result = await app( # type: ignore[func-returns-value]
File "/root/.pyenv/versions/3.10.9/lib/python3.10/site-packages/uvicorn/middleware/proxy_headers.py", line 78, in __call__
return await self.app(scope, receive, send)
File "/root/.pyenv/versions/3.10.9/lib/python3.10/site-packages/fastapi/applications.py", line 270, in __call__
await super().__call__(scope, receive, send)
File "/root/.pyenv/versions/3.10.9/lib/python3.10/site-packages/starlette/applications.py", line 124, in __call__
await self.middleware_stack(scope, receive, send)
File "/root/.pyenv/versions/3.10.9/lib/python3.10/site-packages/starlette/middleware/errors.py", line 184, in __call__
raise exc
File "/root/.pyenv/versions/3.10.9/lib/python3.10/site-packages/starlette/middleware/errors.py", line 162, in __call__
await self.app(scope, receive, _send)
File "/root/.pyenv/versions/3.10.9/lib/python3.10/site-packages/starlette/middleware/exceptions.py", line 79, in __call__
raise exc
File "/root/.pyenv/versions/3.10.9/lib/python3.10/site-packages/starlette/middleware/exceptions.py", line 68, in __call__
await self.app(scope, receive, sender)
File "/root/.pyenv/versions/3.10.9/lib/python3.10/site-packages/fastapi/middleware/asyncexitstack.py", line 21, in __call__
raise e
File "/root/.pyenv/versions/3.10.9/lib/python3.10/site-packages/fastapi/middleware/asyncexitstack.py", line 18, in __call__
await self.app(scope, receive, send)
File "/root/.pyenv/versions/3.10.9/lib/python3.10/site-packages/starlette/routing.py", line 706, in __call__
await route.handle(scope, receive, send)
File "/root/.pyenv/versions/3.10.9/lib/python3.10/site-packages/starlette/routing.py", line 276, in handle
await self.app(scope, receive, send)
File "/root/.pyenv/versions/3.10.9/lib/python3.10/site-packages/starlette/routing.py", line 66, in app
response = await func(request)
File "/root/.pyenv/versions/3.10.9/lib/python3.10/site-packages/fastapi/routing.py", line 235, in app
raw_response = await run_endpoint_function(
File "/root/.pyenv/versions/3.10.9/lib/python3.10/site-packages/fastapi/routing.py", line 163, in run_endpoint_function
return await run_in_threadpool(dependant.call, **values)
File "/root/.pyenv/versions/3.10.9/lib/python3.10/site-packages/starlette/concurrency.py", line 41, in run_in_threadpool
return await anyio.to_thread.run_sync(func, *args)
File "/root/.pyenv/versions/3.10.9/lib/python3.10/site-packages/anyio/to_thread.py", line 31, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
File "/root/.pyenv/versions/3.10.9/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
return await future
File "/root/.pyenv/versions/3.10.9/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 867, in run
result = context.run(func, *args)
File "/root/.pyenv/versions/3.10.9/lib/python3.10/site-packages/cog/server/http.py", line 94, in predict
generic_response = runner.predict(request).get()
File "/root/.pyenv/versions/3.10.9/lib/python3.10/multiprocessing/pool.py", line 774, in get
raise self._value
File "/root/.pyenv/versions/3.10.9/lib/python3.10/multiprocessing/pool.py", line 125, in worker
result = (True, func(*args, **kwds))
File "/root/.pyenv/versions/3.10.9/lib/python3.10/site-packages/cog/server/runner.py", line 84, in predict
handler.append_logs(event.message)
File "/root/.pyenv/versions/3.10.9/lib/python3.10/site-packages/cog/server/runner.py", line 142, in append_logs
assert self.p.logs
AssertionError
0%| | 0/50 [00:00<?, ?it/s]
2%|▏ | 1/50 [00:01<00:51, 1.05s/it]
6%|▌ | 3/50 [00:01<00:14, 3.21it/s]
12%|█▏ | 6/50 [00:01<00:06, 6.65it/s]
18%|█▊ | 9/50 [00:01<00:04, 9.68it/s]
24%|██▍ | 12/50 [00:01<00:03, 12.27it/s]
30%|███ | 15/50 [00:01<00:02, 14.33it/s]
36%|███▌ | 18/50 [00:01<00:02, 15.96it/s]
42%|████▏ | 21/50 [00:02<00:01, 17.23it/s]
48%|████▊ | 24/50 [00:02<00:01, 18.18it/s]
54%|█████▍ | 27/50 [00:02<00:01, 18.87it/s]
60%|██████ | 30/50 [00:02<00:01, 19.28it/s]
66%|██████▌ | 33/50 [00:02<00:00, 19.60it/s]
72%|███████▏ | 36/50 [00:02<00:00, 19.89it/s]
78%|███████▊ | 39/50 [00:02<00:00, 20.11it/s]
84%|████████▍ | 42/50 [00:03<00:00, 20.29it/s]
90%|█████████ | 45/50 [00:03<00:00, 20.41it/s]
96%|█████████▌| 48/50 [00:03<00:00, 20.48it/s]
100%|██████████| 50/50 [00:03<00:00, 14.49it/s]
What is going on at the HTTP layer here? Why can't I POST to my Docker container successfully?
This might be an issue with the recent work adding async support to https://github.com/replicate/cog
@achuinard to reproduce, I assume you trained this model recently? (can you provide an ID from the dreambooth training API?)
Yes, these are all recent model trainings. One recent prediction ID is mc4st5d7ozhv5gc25olufyotpi.
Thanks for getting back to me, @anotherjesse.
Also how is it working on Replicate if the cogs themselves are just broken...you must be doing some magic!
Or I just need to start using the async header and letting Cog webhook me.
@achuinard - we have switched over to cog's "async" style:
- the async API for cog makes requests look the same as Replicate's, so users can switch between the two (local cog vs Replicate) more easily
- async is better for larger systems :)
That said, if the sync API is breaking, that isn't good
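The async style described above boils down to one extra header and a webhook field in the body. A stdlib sketch of assembling that request (the host/port and webhook URL are placeholders, not values from this thread):

```python
import json
import urllib.request

def build_async_request(base_url, prompt, webhook):
    """Build the async-style prediction request: with "Prefer: respond-async"
    cog should return immediately and deliver results to the webhook URL
    instead of blocking until the prediction finishes."""
    body = json.dumps({
        "input": {"prompt": prompt, "width": 512, "height": 512},
        "webhook": webhook,
    }).encode()
    return urllib.request.Request(
        f"{base_url}/predictions",
        data=body,
        headers={
            "Prefer": "respond-async",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# To actually fire it:
# urllib.request.urlopen(build_async_request(
#     "http://localhost:5000", "a photo of zwx man", "https://example.com/hook"))
```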
To clarify - the prediction you shared is a dreambooth training. The cog you are running locally is:
- are you downloading and running the cog image we built as part of the dreambooth API?
- or did you download those weights and build your own dreambooth inference using cog directly? (some of our users build their own cogs using https://github.com/replicate/dreambooth-template)
@anotherjesse This is the Cog / Docker image that your Dreambooth API creates. I did not go through the process of building my own Cog.
Async makes sense; I think I'm using it right:
curl http://localhost:5000/predictions -X POST -H "Prefer: respond-async" -H "Content-Type: application/json;charset=utf-8" -d '{"input":{"prompt":"a photo of zwx man", "width": 512, "height": 512, "disable_safety_check": true}, "webhook":"https://en58kxviypofc.x.pipedream.net"}'
Still no luck though.
Is there any documentation on async invocations using cog predict?
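While waiting on docs, a minimal local webhook receiver can stand in for pipedream when testing the async flow. This is a sketch, not cog's API: it just assumes cog POSTs a JSON body with status/output fields to the webhook URL, as discussed above.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

class WebhookHandler(BaseHTTPRequestHandler):
    """Accept webhook POSTs and print whatever status/output they carry."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        event = json.loads(self.rfile.read(length) or b"{}")
        print("webhook:", event.get("status"), event.get("output"))
        self.send_response(200)
        self.end_headers()

    def log_message(self, fmt, *args):
        pass  # keep the console quiet; we print the payload ourselves

def serve(port=8080):
    """Block forever, receiving webhook callbacks on the given port."""
    HTTPServer(("0.0.0.0", port), WebhookHandler).serve_forever()
```

Run `serve(8080)` and point the `webhook` field at `http://<your-vm-ip>:8080` instead of the pipedream URL.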
Also, I'm trying to use the PUT endpoint instead, passing in a generated UUID. Oddly, it just 404s on me. I must be doing something terribly wrong.
root@mutaro-image-generator:/home/tony_chuinard# curl -X PUT http://localhost:50000/predictions/6e2d263d-40a0-4219-ad62-ea12b7922da1 -H "Prefer: respond-async" -H "Content-Type: application/json" -d '{"input":{"prompt":"a photo of zwx man", "width": 512, "height": 512, "disable_safety_check": true}, "webhook":"https://en58kxviypofc.x.pipedream.net"}'
{"detail":"Not Found"}
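One way to tell whether the running container even exposes a `PUT /predictions/{id}` route is to inspect its OpenAPI schema: the traceback above shows cog's server is FastAPI-based, and FastAPI serves `/openapi.json` by default. A hedged sketch (assumes the container is reachable on localhost:5000; if the PUT route is missing from the output, a 404 is expected regardless of the request body):

```python
import json
import urllib.request

def routes_from_schema(schema):
    """Map each path in an OpenAPI schema dict to its HTTP methods."""
    return {path: sorted(m.upper() for m in methods)
            for path, methods in schema.get("paths", {}).items()}

def list_routes(base_url="http://localhost:5000"):
    """Fetch the container's OpenAPI schema and list the routes it serves."""
    with urllib.request.urlopen(f"{base_url}/openapi.json") as resp:
        return routes_from_schema(json.load(resp))

# for path, methods in list_routes().items():
#     print(path, methods)
```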