dreambooth
Output Docker container fails to run predictions
I'm running my private Replicate Docker image on my own Google Cloud VM instance after running a Dreambooth training.
Unfortunately, this command fails:
curl http://localhost:5000/predictions -X POST -H "Content-Type: application/json" -d '{"input":{"prompt":"a photo of zwx man", "width": 512, "height": 512}}'
It quickly fails with just "Internal Server Error". However, when looking at the Docker container logs, it seems like a prediction actually ran.
Using seed: 12294
using txt2img
INFO: 172.17.0.1:57596 - "POST /predictions HTTP/1.1" 500 Internal Server Error
ERROR: Exception in ASGI application
Traceback (most recent call last):
File "/root/.pyenv/versions/3.10.9/lib/python3.10/site-packages/uvicorn/protocols/http/httptools_impl.py", line 419, in run_asgi
result = await app( # type: ignore[func-returns-value]
File "/root/.pyenv/versions/3.10.9/lib/python3.10/site-packages/uvicorn/middleware/proxy_headers.py", line 78, in __call__
return await self.app(scope, receive, send)
File "/root/.pyenv/versions/3.10.9/lib/python3.10/site-packages/fastapi/applications.py", line 270, in __call__
await super().__call__(scope, receive, send)
File "/root/.pyenv/versions/3.10.9/lib/python3.10/site-packages/starlette/applications.py", line 124, in __call__
await self.middleware_stack(scope, receive, send)
File "/root/.pyenv/versions/3.10.9/lib/python3.10/site-packages/starlette/middleware/errors.py", line 184, in __call__
raise exc
File "/root/.pyenv/versions/3.10.9/lib/python3.10/site-packages/starlette/middleware/errors.py", line 162, in __call__
await self.app(scope, receive, _send)
File "/root/.pyenv/versions/3.10.9/lib/python3.10/site-packages/starlette/middleware/exceptions.py", line 79, in __call__
raise exc
File "/root/.pyenv/versions/3.10.9/lib/python3.10/site-packages/starlette/middleware/exceptions.py", line 68, in __call__
await self.app(scope, receive, sender)
File "/root/.pyenv/versions/3.10.9/lib/python3.10/site-packages/fastapi/middleware/asyncexitstack.py", line 21, in __call__
raise e
File "/root/.pyenv/versions/3.10.9/lib/python3.10/site-packages/fastapi/middleware/asyncexitstack.py", line 18, in __call__
await self.app(scope, receive, send)
File "/root/.pyenv/versions/3.10.9/lib/python3.10/site-packages/starlette/routing.py", line 706, in __call__
await route.handle(scope, receive, send)
File "/root/.pyenv/versions/3.10.9/lib/python3.10/site-packages/starlette/routing.py", line 276, in handle
await self.app(scope, receive, send)
File "/root/.pyenv/versions/3.10.9/lib/python3.10/site-packages/starlette/routing.py", line 66, in app
response = await func(request)
File "/root/.pyenv/versions/3.10.9/lib/python3.10/site-packages/fastapi/routing.py", line 235, in app
raw_response = await run_endpoint_function(
File "/root/.pyenv/versions/3.10.9/lib/python3.10/site-packages/fastapi/routing.py", line 163, in run_endpoint_function
return await run_in_threadpool(dependant.call, **values)
File "/root/.pyenv/versions/3.10.9/lib/python3.10/site-packages/starlette/concurrency.py", line 41, in run_in_threadpool
return await anyio.to_thread.run_sync(func, *args)
File "/root/.pyenv/versions/3.10.9/lib/python3.10/site-packages/anyio/to_thread.py", line 31, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
File "/root/.pyenv/versions/3.10.9/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
return await future
File "/root/.pyenv/versions/3.10.9/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 867, in run
result = context.run(func, *args)
File "/root/.pyenv/versions/3.10.9/lib/python3.10/site-packages/cog/server/http.py", line 94, in predict
generic_response = runner.predict(request).get()
File "/root/.pyenv/versions/3.10.9/lib/python3.10/multiprocessing/pool.py", line 774, in get
raise self._value
File "/root/.pyenv/versions/3.10.9/lib/python3.10/multiprocessing/pool.py", line 125, in worker
result = (True, func(*args, **kwds))
File "/root/.pyenv/versions/3.10.9/lib/python3.10/site-packages/cog/server/runner.py", line 84, in predict
handler.append_logs(event.message)
File "/root/.pyenv/versions/3.10.9/lib/python3.10/site-packages/cog/server/runner.py", line 142, in append_logs
assert self.p.logs
AssertionError
0%| | 0/50 [00:00<?, ?it/s]
2%|▏ | 1/50 [00:01<00:51, 1.05s/it]
6%|▌ | 3/50 [00:01<00:14, 3.21it/s]
12%|█▏ | 6/50 [00:01<00:06, 6.65it/s]
18%|█▊ | 9/50 [00:01<00:04, 9.68it/s]
24%|██▍ | 12/50 [00:01<00:03, 12.27it/s]
30%|███ | 15/50 [00:01<00:02, 14.33it/s]
36%|███▌ | 18/50 [00:01<00:02, 15.96it/s]
42%|████▏ | 21/50 [00:02<00:01, 17.23it/s]
48%|████▊ | 24/50 [00:02<00:01, 18.18it/s]
54%|█████▍ | 27/50 [00:02<00:01, 18.87it/s]
60%|██████ | 30/50 [00:02<00:01, 19.28it/s]
66%|██████▌ | 33/50 [00:02<00:00, 19.60it/s]
72%|███████▏ | 36/50 [00:02<00:00, 19.89it/s]
78%|███████▊ | 39/50 [00:02<00:00, 20.11it/s]
84%|████████▍ | 42/50 [00:03<00:00, 20.29it/s]
90%|█████████ | 45/50 [00:03<00:00, 20.41it/s]
96%|█████████▌| 48/50 [00:03<00:00, 20.48it/s]
100%|██████████| 50/50 [00:03<00:00, 14.49it/s]
What is going on at the HTTP layer here? Why can't I POST to my Docker container successfully?
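To make the HTTP layer concrete: when the endpoint handler raises (here, the `AssertionError` in `append_logs`), the ASGI error middleware swallows the prediction output and the client only ever receives the bare 500 status and a generic body. A minimal self-contained sketch, with a stub server standing in for the cog container (purely illustrative, not cog's actual code):

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.error import HTTPError
from urllib.request import Request, urlopen

# Stub standing in for the cog container: the prediction work happens in
# the worker, but the response path raises, so the client only sees a 500.
class StubHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        self.send_response(500)
        self.send_header("Content-Type", "text/plain")
        self.end_headers()
        self.wfile.write(b"Internal Server Error")

    def log_message(self, *args):  # keep the demo output quiet
        pass

server = HTTPServer(("127.0.0.1", 0), StubHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

body = json.dumps({"input": {"prompt": "a photo of zwx man"}}).encode()
req = Request(f"http://127.0.0.1:{port}/predictions", data=body,
              headers={"Content-Type": "application/json"})
code, text = None, None
try:
    urlopen(req)
except HTTPError as err:
    # All the client gets: the status line plus a generic body. The actual
    # prediction output never crosses the HTTP boundary.
    code, text = err.code, err.read().decode()
server.shutdown()

print(code, text)  # → 500 Internal Server Error
```

So the logs showing a completed diffusion run and the 500 at the client are consistent: the failure is in serializing/streaming the response, not in the prediction itself.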
This might be an issue with the recent work adding async support to https://github.com/replicate/cog
@achuinard to reproduce, I assume you trained this model recently? (can you provide an ID from the Dreambooth training API?)
Yes, these are all recent model trainings. One recent prediction ID is mc4st5d7ozhv5gc25olufyotpi.
Thanks for getting back to me, @anotherjesse.
Also, how is it working on Replicate if the cogs themselves are just broken... you must be doing some magic!
Or I just need to start using the async header and letting Cog webhook me.
@achuinard - we have switched over to cog's "async" style:
- the async API for cog makes requests look the same as Replicate's - users can switch between the two (local cog vs Replicate) more easily
- async is better for larger systems :)
That said, if the sync API is breaking, that isn't good.
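In the async style, the initial request returns immediately and the result is delivered later by the container POSTing to your webhook URL. A minimal in-process sketch of the receiving side, with a simulated delivery (the payload field names here are illustrative, not cog's exact schema):

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import Request, urlopen

received = []  # payloads delivered to the webhook

# Minimal webhook receiver: in the async style, the container POSTs the
# prediction result here instead of returning it in the original response.
class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers["Content-Length"])
        received.append(json.loads(self.rfile.read(length)))
        self.send_response(200)
        self.end_headers()

    def log_message(self, *args):  # keep the demo output quiet
        pass

server = HTTPServer(("127.0.0.1", 0), WebhookHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

# Simulated delivery: a payload shaped like a finished prediction.
payload = json.dumps({"status": "succeeded",
                      "output": ["https://example.com/out-0.png"]}).encode()
urlopen(Request(f"http://127.0.0.1:{port}/webhook", data=payload,
                headers={"Content-Type": "application/json"}))
server.shutdown()

print(received[0]["status"])  # → succeeded
```

The practical upside: the HTTP connection that submitted the prediction doesn't have to stay open for the whole diffusion run.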
To clarify - the prediction you shared is a dreambooth training. For the cog you are running locally:
- did you download and run the cog image we built as part of the dreambooth api?
- or did you download the weights and build your own dreambooth inference cog directly? (some of our users build their own cogs using https://github.com/replicate/dreambooth-template)
@anotherjesse This is the Cog / Docker image that your Dreambooth API creates. I did not go through the process of building my own Cog.
async makes sense, I think I'm using it right:
curl http://localhost:5000/predictions -X POST -H "Prefer: respond-async" -H "Content-Type: application/json;charset=utf-8" -d '{"input":{"prompt":"a photo of zwx man", "width": 512, "height": 512, "disable_safety_check": true}, "webhook":"https://en58kxviypofc.x.pipedream.net"}'
Still no luck though.
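For anyone scripting this rather than using curl, the same async-style request can be assembled with the standard library. This is just a sketch of the header/body shape (the webhook URL is a placeholder); it constructs the request without sending it:

```python
import json
from urllib.request import Request

# Same shape as the curl above: JSON input plus a webhook for the result,
# with the Prefer header requesting the async (202 + callback) behavior.
body = {
    "input": {"prompt": "a photo of zwx man", "width": 512, "height": 512,
              "disable_safety_check": True},
    "webhook": "https://example.com/hook",  # placeholder callback URL
}
req = Request("http://localhost:5000/predictions",
              data=json.dumps(body).encode(),
              headers={"Prefer": "respond-async",
                       "Content-Type": "application/json"},
              method="POST")

print(req.get_method(), req.get_header("Prefer"))  # → POST respond-async
```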
Is there any documentation on async invocations using cog predict?
Also, I'm trying to use the PUT endpoint instead, passing in a generated UUID. Oddly, it just 404s on me. I must be doing something terribly wrong.
root@mutaro-image-generator:/home/tony_chuinard# curl -X PUT http://localhost:50000/predictions/6e2d263d-40a0-4219-ad62-ea12b7922da1 -H "Prefer: respond-async" -H "Content-Type: application/json" -d '{"input":{"prompt":"a photo of zwx man", "width": 512, "height": 512, "disable_safety_check": true}, "webhook":"https://en58kxviypofc.x.pipedream.net"}'
{"detail":"Not Found"}