Model reloads on every API request, adding about 15 seconds to each inference request. Issue with indexing.

Open freckletonj opened this issue 10 months ago • 4 comments

This is a critical bug, but probably a quick fix for the devs.

To be a little more objective: the same request takes ~13-20 seconds per call in API mode, versus ~3.0 seconds through the UI. The difference is entirely due to model reloading.

In case it helps, a sample of logs:

Requested to load FluxClipModel_
got prompt
loaded partially 7709.3625 7709.36181640625 0
Requested to load Flux
loaded partially 7709.3625 7708.8067626953125 239
100%|█████| 6/6 [00:02<00:00,  1.25s/it]
Requested to load AutoencodingEngine
loaded completely 392.293359375 159.87335777282715 True
Prompt executed in 20.23 seconds
Requested to load FluxClipModel_
loaded partially 7709.3625 7709.36181640625 0
Requested to load Flux
loaded partially 7709.3625 7708.8067626953125 239
100%|█████| 6/6 [00:07<00:00,  1.26s/it]
Requested to load AutoencodingEngine
loaded completely 392.293359375 159.87335777282715 True
Prompt executed in 13.32 seconds
got prompt
Requested to load FluxClipModel_
loaded partially 7709.3625 7709.36181640625 0
Requested to load Flux
loaded partially 7709.3625 7708.8067626953125 239
...

On a whim, I loaded the API workflow in the UI to see whether that would somehow force the model used by the API to stay loaded, but alas it does not.

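So it appears that comfy.model_management.load_models_gpu is called on every request, and when the request comes through the API, the lookup in current_loaded_models fails to find the already-loaded model. That looks like an indexing mismatch, so this is probably a quick fix. Here is the relevant loop, with one logging line added (marked with <<<):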

    for x in models:
        loaded_model = LoadedModel(x)
        try:
            loaded_model_index = current_loaded_models.index(loaded_model)
        except Exception as e:
            loaded_model_index = None
            logging.info(f'current_loaded_models.index fail: {e}')   # <<<<<<<<<<<<<<<<<<<<<<

        if loaded_model_index is not None:
            loaded = current_loaded_models[loaded_model_index]
            loaded.currently_used = True
            models_to_load.append(loaded)
        else:
            if hasattr(x, "model"):
                logging.info(f"Requested to load {x.model.__class__.__name__}")
            models_to_load.append(loaded_model)

Indeed, logging the except branch shows it's failing each time to find the API-loaded model:

current_loaded_models.index fail: <comfy.model_management.LoadedModel object at 0x75c8e718b700> is not in list
current_loaded_models.index fail: <comfy.model_management.LoadedModel object at 0x75c8e718a233> is not in list
current_loaded_models.index fail: <comfy.model_management.LoadedModel object at 0x75c8e718a212> is not in list
current_loaded_models.index fail: <comfy.model_management.LoadedModel object at 0x75c8e718a368> is not in list
...

Note that in UI-only mode the lookup fails only the first time a model is loaded; on every later request it finds the loaded models successfully.
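
For anyone following along, here is why a lookup like this can miss a model that is already in the list. Python's list.index compares with ==, so the result depends on how the wrapper class defines equality. Below is a minimal sketch, assuming equality is based on the identity of the wrapped object; I have not confirmed that this is how LoadedModel actually compares. Model, Wrapper, and find_loaded are hypothetical names for illustration only, not ComfyUI code.

```python
class Model:
    """Hypothetical stand-in for the object passed to the loader."""

    def __init__(self, name):
        self.name = name


class Wrapper:
    """Hypothetical stand-in for a LoadedModel-style wrapper (assumption)."""

    def __init__(self, model):
        self.model = model

    def __eq__(self, other):
        # Assumed rule: two wrappers are equal only if they wrap the
        # very same object, not merely an equivalent one.
        return isinstance(other, Wrapper) and self.model is other.model


def find_loaded(loaded, model):
    """Return the index of the wrapper for `model`, or None if absent."""
    try:
        # list.index uses == on each element, so it calls Wrapper.__eq__.
        return loaded.index(Wrapper(model))
    except ValueError:
        return None


flux = Model("Flux")
loaded = [Wrapper(flux)]

# Same underlying object on the second request: the lookup succeeds.
same_object_index = find_loaded(loaded, flux)

# A freshly constructed but otherwise identical object: the lookup fails,
# which would trigger a reload on every request.
rebuilt_object_index = find_loaded(loaded, Model("Flux"))
```

If the real comparison behaves like this, one possible explanation is that the API path hands load_models_gpu a new underlying object on each request while the UI path reuses the old one. That is a guess, not something the logs above prove.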

Originally posted by @freckletonj in #2503

freckletonj avatar Feb 20 '25 21:02 freckletonj

Yes, this is a big bug in the API.

klausHou avatar Feb 24 '25 04:02 klausHou

I'm happy to jump in and help. Could someone from the team point me to relevant portions of the code I'll need to touch?

freckletonj avatar Feb 28 '25 23:02 freckletonj

I'm still willing to help if I get some direction <3

@mcmonkey4eva I saw you were responding on this related issue, are you the right person to ask, or do you know who is?

freckletonj avatar Mar 07 '25 21:03 freckletonj

+1

I'm facing the same issue; any help would be appreciated.

derhuebiii avatar Apr 24 '25 14:04 derhuebiii