Model reloads for every API request, adding an additional ~15 seconds to each inference request. The issue appears to be with indexing.
This is a critical bug, but probably a quick fix for the devs.
To put some numbers on it: in API mode each call takes ~13-20 seconds, versus ~3.0 seconds through the UI for the same request. The difference is entirely due to model reloading.
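For anyone wanting to reproduce the API-side timing, something along these lines should work. This is only a rough sketch: it assumes a default local ComfyUI server at `127.0.0.1:8188`, the standard `/prompt` and `/history` endpoints, and a workflow exported in API format; the `workflow_api.json` filename is a placeholder.

```python
# Rough timing sketch (assumptions: local ComfyUI at 127.0.0.1:8188,
# workflow exported via "Save (API Format)" as workflow_api.json).
import json
import time
import urllib.request

SERVER = "http://127.0.0.1:8188"

def queue_prompt(workflow: dict) -> str:
    """POST the workflow to /prompt and return the queued prompt_id."""
    data = json.dumps({"prompt": workflow}).encode("utf-8")
    req = urllib.request.Request(
        f"{SERVER}/prompt",
        data=data,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["prompt_id"]

def wait_for_completion(prompt_id: str, poll: float = 0.5) -> None:
    """Poll /history until the prompt shows up as executed."""
    while True:
        with urllib.request.urlopen(f"{SERVER}/history/{prompt_id}") as resp:
            history = json.loads(resp.read())
        if prompt_id in history:
            return
        time.sleep(poll)

with open("workflow_api.json") as f:
    workflow = json.load(f)

# Submit the same workflow several times and time each request end to end.
for i in range(3):
    start = time.time()
    wait_for_completion(queue_prompt(workflow))
    print(f"request {i}: {time.time() - start:.1f} seconds")
```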
In case it helps, a sample of logs:
```
Requested to load FluxClipModel_
got prompt
loaded partially 7709.3625 7709.36181640625 0
Requested to load Flux
loaded partially 7709.3625 7708.8067626953125 239
100%|█████| 6/6 [00:02<00:00, 1.25s/it]
Requested to load AutoencodingEngine
loaded completely 392.293359375 159.87335777282715 True
Prompt executed in 20.23 seconds
Requested to load FluxClipModel_
loaded partially 7709.3625 7709.36181640625 0
Requested to load Flux
loaded partially 7709.3625 7708.8067626953125 239
100%|█████| 6/6 [00:07<00:00, 1.26s/it]
Requested to load AutoencodingEngine
loaded completely 392.293359375 159.87335777282715 True
Prompt executed in 13.32 seconds
got prompt
Requested to load FluxClipModel_
loaded partially 7709.3625 7709.36181640625 0
Requested to load Flux
loaded partially 7709.3625 7708.8067626953125 239
...
```

On a whim, I loaded the API workflow in the UI to see if that could somehow force the API model to remain loaded, but alas it does not.
So it appears that `comfy.model_management.load_models_gpu` is getting called each time, and when the request comes through the API interface there seems to be an indexing mismatch, so this is probably a quick fix:

```python
for x in models:
    loaded_model = LoadedModel(x)
    try:
        loaded_model_index = current_loaded_models.index(loaded_model)
    except Exception as e:
        loaded_model_index = None
        logging.info(f'current_loaded_models.index fail: {e}')  # <<<<<<<<<<<<<<<<<<<<<<
    if loaded_model_index is not None:
        loaded = current_loaded_models[loaded_model_index]
        loaded.currently_used = True
        models_to_load.append(loaded)
    else:
        if hasattr(x, "model"):
            logging.info(f"Requested to load {x.model.__class__.__name__}")
        models_to_load.append(loaded_model)
```

Indeed, logging the `except` branch shows it fails every time to find the API-loaded model:

```
current_loaded_models.index fail: <comfy.model_management.LoadedModel object at 0x75c8e718b700> is not in list
current_loaded_models.index fail: <comfy.model_management.LoadedModel object at 0x75c8e718a233> is not in list
current_loaded_models.index fail: <comfy.model_management.LoadedModel object at 0x75c8e718a212> is not in list
current_loaded_models.index fail: <comfy.model_management.LoadedModel object at 0x75c8e718a368> is not in list
...
```

And note, in UI-only mode this only happens the first time the model is loaded; the loaded models are found successfully thereafter.
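For what it's worth, the misses make sense if `LoadedModel` equality is based on the model object it wraps: `list.index` goes through `__eq__`, so if the API path hands `load_models_gpu` a freshly constructed model object on every request, the lookup in `current_loaded_models` can never hit and the weights get reloaded each time. Below is a simplified, self-contained sketch of that suspected failure mode; the classes are stand-ins, not the real ComfyUI ones, and the per-request-new-object cause is a hypothesis.

```python
# Simplified stand-ins to illustrate why current_loaded_models.index() misses;
# these are NOT the real ComfyUI classes.
class FakeModel:
    """Stand-in for the model/patcher object passed into load_models_gpu."""

class FakeLoadedModel:
    """Stand-in for LoadedModel: equal only when wrapping the very same model object."""
    def __init__(self, model):
        self.model = model

    def __eq__(self, other):
        return isinstance(other, FakeLoadedModel) and self.model is other.model

current_loaded_models = []

def load(model):
    wrapper = FakeLoadedModel(model)
    try:
        idx = current_loaded_models.index(wrapper)   # list.index() goes through __eq__
        print("cache hit: reusing already-loaded model")
        return current_loaded_models[idx]
    except ValueError:
        print("cache miss: (re)loading model")       # the slow path seen per API call
        current_loaded_models.append(wrapper)
        return wrapper

# UI-like behaviour: the same model object is reused across prompts.
ui_model = FakeModel()
load(ui_model)     # miss (first load)
load(ui_model)     # hit

# Hypothesised API behaviour: a new model object is built per request.
load(FakeModel())  # miss
load(FakeModel())  # miss again -> full reload every request
```

If that is the real mechanism, the fix is probably less about the `.index()` call itself and more about making the API path reuse (or compare equal to) the model objects that are already resident.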
Originally posted by @freckletonj in #2503
Yes, a big bug in the API.
I'm happy to jump in and help. Could someone from the team point me to the relevant portions of the code I'd need to touch?
I'm still willing to help if I get some direction <3
@mcmonkey4eva I saw you were responding on this related issue. Are you the right person to ask, or do you know who is?
+1
I'm facing the same issue; any help would be appreciated.