Add Segmentation Ordering for Yolo segmentation

Open aimerib opened this issue 1 year ago • 28 comments

Allow YOLO segmentation to sort by left-right, top-bottom, or largest-smallest, and to reverse each of those orders. The params are added to the Regional Prompting group.

aimerib avatar Sep 26 '24 13:09 aimerib

Thank you so much @aimerib. What will be the new sorting command for this?

E.g., how would I modify this one to inpaint only the largest segment?

<segment:yolo-face_yolov9c.pt-1,0.6,0.5>

FurkanGozukara avatar Sep 26 '24 14:09 FurkanGozukara

The command stays the same, and under your parameters list, you should have a Regional Prompting group that you can expand. If you want to segment only the largest found face, toggle on sorting segment, select largest-smallest, and that's it. It will also allow you to sort left-right, top-bottom, and reverse that too.
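As a rough illustration of the ordering options described above, here is a minimal sketch in plain Python. It is not the code from this PR: the `Detection` class, the `SORT_ORDERS` table, and the reversed option names `right-left` and `bottom-top` are hypothetical and only mirror the behaviour discussed in this thread.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """Hypothetical stand-in for one detected region (not SwarmUI's real type)."""
    x: float       # left edge of the bounding box
    y: float       # top edge of the bounding box
    width: float
    height: float

    @property
    def area(self) -> float:
        return self.width * self.height


# Each sort-order name maps to (key function, reverse flag).
# The reversed names are assumptions made for this sketch.
SORT_ORDERS = {
    "left-right": (lambda d: d.x, False),
    "right-left": (lambda d: d.x, True),
    "top-bottom": (lambda d: d.y, False),
    "bottom-top": (lambda d: d.y, True),
    "largest-smallest": (lambda d: d.area, True),
    "smallest-largest": (lambda d: d.area, False),
}


def sort_detections(detections, order="left-right"):
    """Return a new list of detections sorted by the named order."""
    key, reverse = SORT_ORDERS[order]
    return sorted(detections, key=key, reverse=reverse)


faces = [
    Detection(x=600, y=50, width=40, height=40),     # small background face
    Detection(x=100, y=200, width=300, height=320),  # main subject
    Detection(x=450, y=400, width=80, height=90),
]

# With "largest-smallest", the main subject comes first regardless of position.
largest_first = sort_detections(faces, "largest-smallest")
```

Taking the first element of `largest_first` corresponds to the "only the largest found face" case.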

aimerib avatar Sep 26 '24 14:09 aimerib

Actually, use <segment:yolo-face_yolov9c.pt,0.6,0.5>. Don't pass an index; it will be ignored anyway.

aimerib avatar Sep 26 '24 14:09 aimerib

> actually: <segment:yolo-face_yolov9c.pt,0.6,0.5> don't pass an index. it will ignore it anyways.

But I want to inpaint only the first face. Is that not possible?

FurkanGozukara avatar Sep 26 '24 14:09 FurkanGozukara

I thought the goal was to inpaint the largest face, right? This will detect the largest, no matter the position. If you want only the first found face, then you keep the index. If the use-case is different, do you mind giving me a more detailed scenario so I can mimic it locally?

aimerib avatar Sep 26 '24 14:09 aimerib

> I thought the goal was to inpaint the largest face, right? This will detect the largest, no matter the position. If you want only the first found face, then you keep the index. If the use-case is different, do you mind giving me a more detailed scenario so I can mimic it locally?

It is like this: let's say it found 3 faces. Inpaint only the first one when sorted by max area.

And thank you so much.

image

FurkanGozukara avatar Sep 26 '24 14:09 FurkanGozukara

So, for this scenario:

Use your regular <segment:yolo-face_yolov9c.pt,0.6,0.5> (no index), but in the params, go to Regional Prompting, turn on Segmentation Ordering, select "largest-smallest", and run your generation. What you described is the exact use case I wrote this for. It should pick up only your main face in the image above; nothing else is necessary.

aimerib avatar Sep 26 '24 14:09 aimerib

> So, for this scenario:
>
> Use your regular <segment:yolo-face_yolov9c.pt,0.6,0.5> (no index), but in the params, go to Regional Prompting, turn on Segmentation Ordering, select "largest-smallest", and run your generation. What you described is the exact use case I wrote this for. It should pick up only your main face in the image above; nothing else is necessary.

Awesome, this solves my issue.

But what if someone needs to inpaint 2 different people with different prompts? What can they do?

FurkanGozukara avatar Sep 26 '24 14:09 FurkanGozukara

I think I mentioned in the closed issue, but my code only returns the first match. There's room for improving on that exact scenario. I figured that implementing just first match initially was a low-risk way to test that this works. Then we can add the sorting to the scenario where the user passes an index as well. I'm shooting for minimal code changes until we validate that this works well and doesn't break anything.

aimerib avatar Sep 26 '24 14:09 aimerib

> I think I mentioned in the closed issue, but my code only returns the first match. There's room for improving on that exact scenario. I figured that implementing just first match initially was a low-risk way to test that this works. Then we can add the sorting to the scenario where the user passes an index as well. I'm shooting for minimal code changes until we validate that this works well and doesn't break anything.

Awesome, makes sense. I am waiting for the merge to test ASAP.

FurkanGozukara avatar Sep 26 '24 14:09 FurkanGozukara

I believe this new commit should address most concerns. Primarily: it simplifies the params, supports indexes, and no longer uses boxes; it uses the existing sorting and just expands it with custom options.

abaddouh avatar Sep 26 '24 21:09 abaddouh

I just merged locally and tested.

Index works as expected.

Smallest-largest sort works too. Amazing.

FurkanGozukara avatar Sep 27 '24 12:09 FurkanGozukara

I'm not sure when smallest to largest would be helpful, but it made sense to add it as an inverse. Depending on the model you use, it might detect a small face on a piece of bread if you don't set up the threshold correctly. Well, now you could target that face too... hahaha

aimerib avatar Sep 27 '24 12:09 aimerib

OK, there is an edge-case error, I think when no face is found:

2024-09-27 12:34:50.853 [Debug] [ComfyUI-4/STDERR] !!! Exception during processing !!! At least one stride in the given numpy array is negative, and tensors with negative strides are not currently supported. (You can probably work around this by making a copy of your array  with array.copy().) 
2024-09-27 12:34:50.854 [Warning] [ComfyUI-4/STDERR] Traceback (most recent call last):
2024-09-27 12:34:50.854 [Warning] [ComfyUI-4/STDERR]   File "/home/Ubuntu/apps/StableSwarmUI/dlbackend/ComfyUI/execution.py", line 323, in execute
2024-09-27 12:34:50.854 [Warning] [ComfyUI-4/STDERR]     output_data, output_ui, has_subgraph = get_output_data(obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
2024-09-27 12:34:50.854 [Warning] [ComfyUI-4/STDERR]   File "/home/Ubuntu/apps/StableSwarmUI/dlbackend/ComfyUI/execution.py", line 198, in get_output_data
2024-09-27 12:34:50.854 [Warning] [ComfyUI-4/STDERR]     return_values = _map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
2024-09-27 12:34:50.854 [Warning] [ComfyUI-4/STDERR]   File "/home/Ubuntu/apps/StableSwarmUI/dlbackend/ComfyUI/execution.py", line 169, in _map_node_over_list
2024-09-27 12:34:50.854 [Warning] [ComfyUI-4/STDERR]     process_inputs(input_dict, i)
2024-09-27 12:34:50.855 [Warning] [ComfyUI-4/STDERR]   File "/home/Ubuntu/apps/StableSwarmUI/dlbackend/ComfyUI/execution.py", line 158, in process_inputs
2024-09-27 12:34:50.855 [Warning] [ComfyUI-4/STDERR]     results.append(getattr(obj, func)(**inputs))
2024-09-27 12:34:50.855 [Warning] [ComfyUI-4/STDERR]   File "/home/Ubuntu/apps/StableSwarmUI/src/BuiltinExtensions/ComfyUIBackend/ExtraNodes/SwarmComfyExtra/SwarmYolo.py", line 48, in seg
2024-09-27 12:34:50.855 [Warning] [ComfyUI-4/STDERR]     sortedindices = torch.tensor(np.ascontiguousarray(sortedindices))
2024-09-27 12:34:50.855 [Warning] [ComfyUI-4/STDERR] ValueError: At least one stride in the given numpy array is negative, and tensors with negative strides are not currently supported. (You can probably work around this by making a copy of your array  with array.copy().) 
2024-09-27 12:34:50.855 [Warning] [ComfyUI-4/STDERR] 
2024-09-27 12:34:50.856 [Debug] [ComfyUI-4/STDERR] Prompt executed in 1.05 seconds
2024-09-27 12:34:50.859 [Debug] Failed to process comfy workflow for inputs T2IParamInput(prompt: close shot photo of ohwx man standing on the moon's surface, surrounded by a breathtaking view of Earth in the sky, the stars glittering in the dark void, wearing a futuristic space suit with chrome accents, sleek aerodynamic design, featuring small blue L..., model: FT_256_images_BS7_5e_01_xFormers-000040, seed: 56941881, steps: 40, cfgscale: 1, aspectratio: 1:1, width: 1024, height: 1024, sampler: ipndm, fluxguidancescale: 4, segmentationsortorder: largest-smallest, automaticvae: True, preferreddtype: default, negativeprompt: ) with raw workflow { "4": { "class_type": "UNETLoader", "inputs": { "unet_name": "FT_256_images_BS7_5e_01_xFormers-000040.safetensors", "weight_dtype": "default" } }, "100": { "class_type": "DualCLIPLoader", "inputs": { "clip_name1": "t5xxl_enconly.safetensors", "clip_name2": "clip_l.safetensors", "type": "flux" } }, "101": { "class_type": "VAELoader", "inputs": { "vae_name": "ae.safetensors" } }, "5": { "class_type": "EmptySD3LatentImage", "inputs": { "batch_size": 1, "height": 1024, "width": 1024 } }, "6": { "class_type": "SwarmClipTextEncodeAdvanced", "inputs": { "clip": [ "100", 0 ], "steps": 40, "prompt": "close shot photo of ohwx man standing on the moon's surface, surrounded by a breathtaking view of Earth in the sky, the stars glittering in the dark void, wearing a futuristic space suit with chrome accents, sleek aerodynamic design, featuring small blu...", "width": 1536, "height": 1536, "target_width": 1024, "target_height": 1024, "guidance": 4 } }, "7": { "class_type": "SwarmClipTextEncodeAdvanced", "inputs": { "clip": [ "100", 0 ], "steps": 40, "prompt": "", "width": 832, "height": 832, "target_width": 1024, "target_height": 1024, "guidance": 4 } }, "10": { "class_type": "SwarmKSampler", "inputs": { "model": [ "4", 0 ], "noise_seed": 56941881, "steps": 40, "cfg": 1, "sampler_name": "ipndm", "scheduler": "simple", "positive": [ "6", 0 
], "negative": [ "7", 0 ], "latent_image": [ "5", 0 ], "start_at_step": 0, "end_at_step": 10000, "return_with_leftover_noise": "disable", "add_noise": "enable", "control_after_generate": "fixed", "var_seed": 0, "var_seed_strength": 0, "sigma_min": -1, "sigma_max": -1, "rho": 7, "previews": "default", "tile_sample": False, "tile_size": 1024 } }, "8": { "class_type": "VAEDecode", "inputs": { "vae": [ "101", 0 ], "samples": [ "10", 0 ] } }, "102": { "class_type": "SwarmYoloDetection", "inputs": { "image": [ "8", 0 ], "model_name": "face_yolov9c.pt", "index": 1, "sort_order": "largest-smallest" } }, "103": { "class_type": "SwarmMaskBlur", "inputs": { "mask": [ "102", 0 ], "blur_radius": 10, "sigma": 1 } }, "104": { "class_type": "GrowMask", "inputs": { "mask": [ "103", 0 ], "expand": 16, "tapered_corners": True } }, "105": { "class_type": "SwarmMaskThreshold", "inputs": { "mask": [ "104", 0 ], "min": 0.01, "max": 1 } }, "106": { "class_type": "SwarmMaskBounds", "inputs": { "mask": [ "105", 0 ], "grow": 8 } }, "107": { "class_type": "SwarmImageCrop", "inputs": { "image": [ "8", 0 ], "x": [ "106", 0 ], "y": [ "106", 1 ], "width": [ "106", 2 ], "height": [ "106", 3 ] } }, "108": { "class_type": "CropMask", "inputs": { "mask": [ "105", 0 ], "x": [ "106", 0 ], "y": [ "106", 1 ], "width": [ "106", 2 ], "height": [ "106", 3 ] } }, "109": { "class_type": "SwarmImageScaleForMP", "inputs": { "image": [ "107", 0 ], "width": 1024, "height": 1024, "can_shrink": True } }, "110": { "class_type": "VAEEncode", "inputs": { "vae": [ "101", 0 ], "pixels": [ "109", 0 ] } }, "111": { "class_type": "SetLatentNoiseMask", "inputs": { "samples": [ "110", 0 ], "mask": [ "108", 0 ] } }, "112": { "class_type": "DifferentialDiffusion", "inputs": { "model": [ "4", 0 ] } }, "113": { "class_type": "SwarmClipTextEncodeAdvanced", "inputs": { "clip": [ "100", 0 ], "steps": 40, "prompt": "photo of ohwx man", "width": 1536, "height": 1536, "target_width": 1024, "target_height": 1024, "guidance": 4 } }, 
"114": { "class_type": "SwarmClipTextEncodeAdvanced", "inputs": { "clip": [ "100", 0 ], "steps": 40, "prompt": "", "width": 832, "height": 832, "target_width": 1024, "target_height": 1024, "guidance": 4 } }, "115": { "class_type": "SwarmKSampler", "inputs": { "model": [ "4", 0 ], "noise_seed": 56941883, "steps": 40, "cfg": 1, "sampler_name": "ipndm", "scheduler": "simple", "positive": [ "113", 0 ], "negative": [ "114", 0 ], "latent_image": [ "111", 0 ], "start_at_step": 9, "end_at_step": 10000, "return_with_leftover_noise": "disable", "add_noise": "enable", "control_after_generate": "fixed", "var_seed": 0, "var_seed_strength": 0, "sigma_min": -1, "sigma_max": -1, "rho": 7, "previews": "default", "tile_sample": False, "tile_size": 1024 } }, "116": { "class_type": "VAEDecode", "inputs": { "vae": [ "101", 0 ], "samples": [ "115", 0 ] } }, "117": { "class_type": "ImageScale", "inputs": { "image": [ "116", 0 ], "width": [ "106", 2 ], "height": [ "106", 3 ], "upscale_method": "bilinear", "crop": "disabled" } }, "118": { "class_type": "ThresholdMask", "inputs": { "mask": [ "108", 0 ], "value": 0.001 } }, "119": { "class_type": "ImageCompositeMasked", "inputs": { "destination": [ "8", 0 ], "source": [ "117", 0 ], "mask": [ "118", 0 ], "x": [ "106", 0 ], "y": [ "106", 1 ], "resize_source": False } }, "9": { "class_type": "SwarmSaveImageWS", "inputs": { "images": [ "119", 0 ], "bit_depth": "8bit" } } }
2024-09-27 12:34:50.859 [Debug] Refused to generate image for local: ComfyUI execution error: At least one stride in the given numpy array is negative, and tensors with negative strides are not currently supported. (You can probably work around this by making a copy of your array  with array.copy().) 
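For context on the `ValueError` in the log above: reversing a NumPy array with `[::-1]` returns a view with a negative stride, which is the condition the message complains about. The thread does not show how `sortedindices` was built, so the reversal below is an assumption for illustration only, and `torch` is left out so the snippet runs with NumPy alone.

```python
import numpy as np

# Hypothetical mask areas for three detections.
areas = np.array([1200.0, 96000.0, 7200.0])

# argsort is ascending; reversing with [::-1] yields a view whose stride is negative.
descending_view = np.argsort(areas)[::-1]

# .copy() (the workaround named in the error message) writes the reversed
# order into fresh memory with ordinary positive strides.
descending_copy = descending_view.copy()

# Alternative that never creates a negative stride: sort on the negated values.
descending_direct = np.argsort(-areas)

print(descending_view.strides, descending_copy.strides)
```

Either `descending_copy` or `descending_direct` gives the same largest-first order without a negative stride.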

FurkanGozukara avatar Sep 27 '24 12:09 FurkanGozukara

Oh! I will fix that right away! Good catch!

aimerib avatar Sep 27 '24 12:09 aimerib

@aimerib also, if there is only 1 face, index 1 will still work, right?

Like:

<segment:face-1,0.6,0.5>

<segment:yolo-face_yolov9c.pt-1,0.6,0.5>

FurkanGozukara avatar Sep 27 '24 12:09 FurkanGozukara

It should, but I will test it while I'm fixing the other error you found

aimerib avatar Sep 27 '24 12:09 aimerib

Basically, we already did ordering before, just always left-right, so using index 1 with only one face found would return that face
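The index behaviour described here can be sketched as a 1-based lookup into the already-sorted list. This is a hypothetical helper, not the PR's code; treating index 0 as "no index given, keep everything" and returning nothing for an out-of-range index are assumptions made for the example.

```python
def pick_by_index(sorted_items, index):
    """Select from an already-sorted list using a 1-based index.

    Assumptions for illustration only: index 0 (or less) means no index was
    given, so every item is kept; an out-of-range index selects nothing.
    """
    if index <= 0:
        return list(sorted_items)
    if index > len(sorted_items):
        return []
    return [sorted_items[index - 1]]


# With a single detected face, index 1 returns that face.
single = pick_by_index(["only_face"], 1)

# With several faces, index 2 returns the second one in the chosen order.
second = pick_by_index(["face_a", "face_b", "face_c"], 2)
```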

aimerib avatar Sep 27 '24 12:09 aimerib

@FurkanGozukara what was the specific segment command you used in the prompt that generated that error? I see you used segmentationsortorder: largest-smallest, so I'm trying with that too, but I'm having a hard time hitting the same error.

aimerib avatar Sep 27 '24 13:09 aimerib

@aimerib I found another bug.

When sorting is enabled, it can't find any face, but when it is disabled, it can.

Here is an example:

photo of ohwx man riding a majestic, muscular white tiger through a dense mystical forest, with trees towering overhead, their twisted branches forming an intricate canopy. The air is filled with glowing fireflies and floating specks of light, creating an ethereal atmosphere. Ohwx wears a regal, intricately embroidered tunic woven with golden threads depicting ancient symbols, a flowing cape with a high collar that shimmers in the dim, magical light. His leather boots have silver buckles, each carved with ornate designs. The tiger's fur glows in the moonlight, and its striking blue eyes mirror Ohwx's vigilant expression. <segment:yolo-face_yolov9c.pt,0.8,0.5>photo of ohwx man

With sorting disabled, it works:

image

With sorting enabled:

image

image

FurkanGozukara avatar Sep 27 '24 13:09 FurkanGozukara

Thank you! I was able to reproduce this error. I've pushed a new commit that should account for the error you were finding. Can you check both scenarios you described above when you have a moment? I really appreciate the help testing these cases!

aimerib avatar Sep 27 '24 13:09 aimerib

@aimerib excellent job.

The case that was giving an error previously didn't give any error.

I am going to do more testing now, hopefully.

FurkanGozukara avatar Sep 27 '24 20:09 FurkanGozukara

By the way, I generated over 2k images and had no errors with the latest version.

FurkanGozukara avatar Sep 28 '24 18:09 FurkanGozukara

@mcmonkey4eva can you finalize this? Perhaps fix the remaining part? I would really like to show this in my next tutorial.

Thank you so much all

FurkanGozukara avatar Oct 05 '24 00:10 FurkanGozukara

@FurkanGozukara I'll work on finishing with the feedback and submit a PR at some point tonight or tomorrow morning at the latest. Sorry about the delay.

aimerib avatar Oct 05 '24 00:10 aimerib

Still waiting :/ Thank you so much.

FurkanGozukara avatar Oct 10 '24 21:10 FurkanGozukara

Yeah, it will be a while longer. There was a massive hurricane here this week, and my family and I had to evacuate out of state. Our town is still full of flooding so it will be a while before I can look at this again. I’m thinking at least after this weekend.

aimerib avatar Oct 10 '24 21:10 aimerib

@aimerib thanks i hope you get better asap without any issues

FurkanGozukara avatar Oct 10 '24 21:10 FurkanGozukara

I showed this in my new tutorial. I hope it gets merged as soon as possible: https://youtu.be/FvpWy1x5etM

FurkanGozukara avatar Oct 22 '24 14:10 FurkanGozukara

@FurkanGozukara I am heading back home this weekend, so either this Sunday or this coming week I will work on the feedback left in this PR and work towards getting it merged. So cool to see this out in the wild already!

aimerib avatar Oct 25 '24 18:10 aimerib