ComfyUI_frontend
Improve performance when large number of prompts are queued
Is there an existing issue for this?
- [x] I have searched the existing issues and checked the recent builds/commits
What would your feature do?
Some users queue a large number of prompts at a time with large graphs. They commonly report that overall performance suffers a lot when doing this. The issue is the frequency and eagerness of requests/messages to the queue endpoint.
Reproduction Steps
- Open DevTools in the browser (F12)
- Go to the Network tab
- In the search/filter bar type 'queue'
- Go back to the graph without closing DevTools
- Create a workflow with a lot of nodes
- Increase the batch count on the `Run` button to a high number
- Press `Run`
- Observe the `/queue` responses in the Network panel
- Notice that:
  - The `/queue` response includes the entire workflow for every single pending task (along with other fields). The type is here: https://github.com/Comfy-Org/ComfyUI_frontend/blob/8db088b27a5669e423b1d2644fa574bf4a3c3d85/src/schemas/apiSchema.ts#L185-L193
  - Requests to `/queue` are eager
  - The `queueStore` updates the queue and history in the same `update` action. As a result, `/queue` requests are made when tasks are created and when tasks are completed, which is probably not necessary.
  - Requests to `/queue` are made for each task created, even those created in batches
Possible solutions
- Send prompts to server less eagerly when batch size of queue button is large
- Only request necessary properties of queued prompts
- Don't repeatedly request information about queued items since they are not subject to change
- Add a `max_items` param to the queue endpoint, similar to history (see the sketch after this list)
- Evaluate the graph less eagerly or use DP, although the issue seems to be with the large, frequent requests rather than graph evaluation
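A rough, self-contained sketch of what a `max_items` parameter on the `/queue` endpoint could look like on the backend (aiohttp), mirroring what `/history` already does. The `FakePromptQueue` class, the response field names, and the handler shape are illustrative assumptions, not the actual ComfyUI implementation:

```python
from aiohttp import web

# Stand-in for ComfyUI's real prompt queue; only the shape matters for this sketch.
class FakePromptQueue:
    def __init__(self):
        self.running = []
        self.pending = [
            {"prompt_id": str(i), "workflow": {"note": "imagine a large graph here"}}
            for i in range(500)
        ]

    def get_current_queue(self):
        return self.running, self.pending

prompt_queue = FakePromptQueue()
routes = web.RouteTableDef()

@routes.get("/queue")
async def get_queue(request):
    # Optional max_items query param, mirroring what /history already supports
    max_items = request.rel_url.query.get("max_items")
    running, pending = prompt_queue.get_current_queue()
    if max_items is not None:
        pending = pending[: int(max_items)]
    return web.json_response({"queue_running": running, "queue_pending": pending})

app = web.Application()
app.add_routes(routes)

if __name__ == "__main__":
    web.run_app(app)
```

With this shape, the frontend could request e.g. `GET /queue?max_items=100` and keep the payload bounded no matter how many prompts are pending.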
Proposed workflow
- Load large graph
- Increase batch count to 200+, e.g., to run overnight or while out of the house
- Press queue button
- The app works at the same efficiency/speed as it does during normal usage
Additional information
The /history endpoint, used for completed tasks, faced the same issue in the past. It was solved by limiting the number of items returned in the request (https://github.com/Comfy-Org/ComfyUI_frontend/pull/393).
The maxItems arg is supported in the /history route handler on the backend, but not for the /queue handler.
For reference, the max_items handling for /history follows roughly the pattern sketched below (the original snippet is not reproduced here).
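A minimal sketch of that pattern, assuming an aiohttp route handler and a simple dict-backed history store; the real names and data structures in the ComfyUI backend differ:

```python
from aiohttp import web

# Stand-in history store; the real data lives in ComfyUI's queue/history classes.
HISTORY = {f"prompt-{i}": {"status": "success", "outputs": {}} for i in range(1000)}

def get_history(max_items=None):
    items = list(HISTORY.items())
    if max_items is not None:
        items = items[-max_items:]  # keep only the most recent entries
    return dict(items)

routes = web.RouteTableDef()

@routes.get("/history")
async def history_handler(request):
    # max_items arrives as a query parameter, e.g. GET /history?max_items=64
    max_items = request.rel_url.query.get("max_items")
    return web.json_response(get_history(int(max_items) if max_items is not None else None))

app = web.Application()
app.add_routes(routes)

if __name__ == "__main__":
    web.run_app(app)
```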
Related issues
- https://github.com/comfyanonymous/ComfyUI/issues/2000#issuecomment-2631586686
- https://github.com/comfyanonymous/ComfyUI/discussions/7284
- https://github.com/comfyanonymous/ComfyUI/issues/6939
- https://github.com/comfyanonymous/ComfyUI/issues/6676
- https://github.com/comfyanonymous/ComfyUI/issues/5073
- https://github.com/comfyanonymous/ComfyUI/issues/2000
- https://github.com/comfyanonymous/ComfyUI/issues/8070
Is there any progress on this? I completely disabled all /history requests to see if it would help, but it still slows down significantly after ~80-100 items in queue.
There's no progress yet. Potentially some OSS contributor can be a hero and solve this.
To my knowledge, the issue is actually with the /queue endpoint. I just updated the issue body with more details. You may be able to adjust your solution for the /queue endpoint as a temporary workaround.
I've never used Vue in my life, but do you think gzipping the payload would make it better? I think we'd then need the backend to expect the new gzipped data. I can try doing it, but I need a way to "simulate" running the app without killing my laptop.
Yes, I think that could work.
I'll start working on it, but don't get your hopes up since I haven't used Vue before. Do you have a way to simulate the backend?
First, I'll try gzipping. If the improvement isn't significant, I'll create a new endpoint for batched data with gzip compression. That way, the complex graph is sent only once, and only the dynamic parts, like the seed, take up space in the request (see the sketch below).
However, creating the batch endpoint might be a problem since the backend would need modifications, and it's in a separate repository. Would that be fine?
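A rough sketch of what such a batch endpoint could look like. The `/prompt_batch` route, the payload shape, and the in-memory queue are all hypothetical; the point is only to show the graph being sent once with small per-item overrides:

```python
import copy
import uuid

from aiohttp import web

routes = web.RouteTableDef()
pending_queue = []  # stand-in for the real prompt queue

@routes.post("/prompt_batch")  # hypothetical endpoint, not part of ComfyUI today
async def prompt_batch(request):
    body = await request.json()
    graph = body["prompt"]         # the full workflow, sent once
    overrides = body["overrides"]  # per-item deltas, e.g. [{"3.inputs.seed": 42}, ...]

    prompt_ids = []
    for delta in overrides:
        prompt = copy.deepcopy(graph)
        for path, value in delta.items():
            node_id, _, input_name = path.partition(".inputs.")
            prompt[node_id]["inputs"][input_name] = value
        prompt_id = str(uuid.uuid4())
        pending_queue.append((prompt_id, prompt))
        prompt_ids.append(prompt_id)

    return web.json_response({"prompt_ids": prompt_ids})

app = web.Application()
app.add_routes(routes)

if __name__ == "__main__":
    web.run_app(app)
```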
I don't know of an easy way to quickly simulate the backend, but you can start the backend server with the --cpu arg to run on a low-end machine or laptop. Also see the Development section of the README for setting up the frontend dev server, if you haven't already.
Regarding the compression idea, please also note this: https://github.com/comfyanonymous/ComfyUI/pull/7231#issuecomment-2726306884. I wonder if using this option can resolve the issue somewhat.
I just tested the flag, and the results are a bit strange.
Test Setup
I loaded the models in both runs and tried to keep the conditions as similar as possible.
Run 1: Without --enable-compress-response-body
- Time to load 100 requests: 1 minute 36 seconds (00:01:36.80).
- The last ~15 requests had huge payloads (~4MB each), increasing quickly.
- Each request took around 10 seconds to resolve.
Run 2: With --enable-compress-response-body
- Time to load 100 requests: 1 minute 50 seconds (00:01:50.80).
- The last ~15 requests had small payloads (~640KB, consistent size).
- However, the last 15 requests took more than 30 seconds each to resolve???
- I also got this warning on all prompts:
UserWarning: Synchronous compression of large response bodies (1934928 bytes) might block the async event loop. Consider providing a custom value to zlib_executor_size/zlib_executor response properties or disabling compression on it.
Analysis & Solution
It looks like this option can fully fix the payload size issue, but the backend needs a small modification.
The problem seems to be that Python's event loop is getting blocked because the compression is happening synchronously.
Possible Fixes (GPT suggestions):
- Use `aiohttp` and configure compression properly to avoid blocking the event loop.
- Adjust `zlib_executor_size`/`zlib_executor` to offload compression to a separate thread.
- Disable compression for very large responses and only apply it selectively.
If the backend is modified correctly, this should work without causing delays.
When I went to the backend, I couldn't find any option to put the zlib_executor on a separate thread. The code should look something like this:
```python
from aiohttp import web

async def handler(request):
    data = "Large response content" * 100000  # simulate a large response body
    # zlib_executor_size is a byte threshold: bodies larger than this are
    # compressed in an executor instead of blocking the event loop
    resp = web.Response(text=data, zlib_executor_size=1024 * 1024)
    resp.enable_compression()
    return resp

app = web.Application()
app.router.add_get("/", handler)
web.run_app(app)
```
Or we create our own thread pool and assign it to the responses:
```python
from concurrent.futures import ThreadPoolExecutor
from aiohttp import web

# Dedicated thread pool for compressing large response bodies
compression_executor = ThreadPoolExecutor(max_workers=4)

async def handler(request):
    data = "Large response content" * 100000  # simulate a large response body
    resp = web.Response(
        text=data,
        zlib_executor=compression_executor,  # compress in this thread pool
        zlib_executor_size=1024 * 1024,      # only offload bodies larger than ~1 MB
    )
    resp.enable_compression()
    return resp

app = web.Application()
app.router.add_get("/", handler)
web.run_app(app)
```
Is it really the size of the network transfer that is the issue? On loopback (running locally) I can transfer many GB/s, yet the throughput of the queuing process is only a small fraction of that, and it is still very slow.
It may help with transfer overhead for remote connections, but I don't think gzipping/compressing the transfer is going to improve the speed for queuing many runs in a row.
I think it's much more likely it's the parsing/processing of the entire node-graph after it's been sent (or perhaps before, when it's getting serialized) that is the issue, not the transfer itself.
Hi, in https://github.com/comfyanonymous/ComfyUI/issues/6939 I suggested to "Queue up all items before starting". Do you think that this is a misdiagnosis of the problem?
It seemed like it was lagging because the workflow was running at the same time it was queuing things. But maybe you guys would say that this doesn't matter at all?
must be solved
For anyone who wants a quick fix for this issue, just add one line to the file: venv\Lib\site-packages\comfyui_frontend_package\static\assets\index-C1a9-oeR.js
line: 60543
```js
async getQueue() {
  // fix queue slow: short-circuit and never fetch the real queue
  return { Running: [], Pending: [] };
}
```
I just added 900 tasks to the queue in 10 seconds, and it's running well.
This bug has existed for months, and it seems no one cares. According to the code, each time one task is added, the entire queue is fetched again, so technically it is on the order of N*(N-1) work, or something similarly heavy?
I wrote a small change making Comfy a lot faster when the prompt queue is large. It won't work for thousands of queue items, but it's good enough for several hundred.
https://github.com/comfyanonymous/ComfyUI/pull/8176
I suggest doing 2 things:
- Mark as a bug rather than a feature request 😜
- Consider adding paging (on top of the improvements already suggested, i.e. by @miabrahams and @catboxanon) and return only a portion of the queue rather than the whole thing. This will cap the performance impact at the maximum page size (say 100 items). I know it might need some brain cells to muscle up, since right now the queue uses a heap for storage and isn't particularly "pageable" on its own; a rough sketch of what paging could look like is below.
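A minimal sketch of paging over a heap-backed pending queue, assuming `offset`/`limit` query parameters and a toy in-memory heap for illustration; the real ComfyUI queue structures and response fields differ:

```python
import heapq

from aiohttp import web

# Stand-in heap-backed queue: (priority_number, prompt_id, workflow)
heap = [(i, f"prompt-{i}", {"note": "large graph"}) for i in range(1000)]
heapq.heapify(heap)

routes = web.RouteTableDef()

@routes.get("/queue")
async def get_queue(request):
    # Page parameters, e.g. GET /queue?offset=0&limit=100
    offset = int(request.rel_url.query.get("offset", 0))
    limit = int(request.rel_url.query.get("limit", 100))

    # A heap isn't pageable directly, so take a sorted snapshot and slice it.
    # nsmallest avoids sorting the whole queue when only the first pages are needed.
    page = heapq.nsmallest(offset + limit, heap)[offset : offset + limit]

    return web.json_response({
        "queue_pending": [{"prompt_id": pid} for _, pid, _ in page],
        "total": len(heap),
    })

app = web.Application()
app.add_routes(routes)

if __name__ == "__main__":
    web.run_app(app)
```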
> Mark as a bug rather than a feature request 😜
Good point, done!
> Consider adding paging (on top of the improvements already suggested, i.e. by miabrahams and catboxanon) and return only a portion of the queue rather than the whole thing. This will cap the performance impact at the maximum page size (say 100 items). I know it might need some brain cells to muscle up, since right now the queue uses a heap for storage and isn't particularly "pageable" on its own.
Good idea. We would also need to adjust the logic a bit on the client side, as some components like the queue sidebar rely on the information.
Since https://github.com/comfyanonymous/ComfyUI/pull/8176 is merged, we should monitor the performance and any community feedback for a while, then determine if more improvements are needed on the client side.
Based on user reports and community sentiment, it seems this issue is mostly solved by https://github.com/comfyanonymous/ComfyUI/pull/8176. Of course, the performance is not leetcode-optimal, but it only seems to affect those who are using ComfyUI as a backend for a more involved application or use case -- and in those cases, the optimizations necessary might be the burden of the developers involved or require more comprehensive efforts on the backend.
If you still experience this issue (arriving later), make sure you are on a ComfyUI version after https://github.com/comfyanonymous/ComfyUI/pull/8176 and that the issue is not caused by some system- or environment-specific conditions. If you've confirmed both of those, please report below and I will re-open.
Thanks again to miabrahams for fixing this issue.