stable-diffusion-webui-depthmap-script

Only using 1 core when generating inpainted mesh/occluded mesh

Open 1arrcy1 opened this issue 1 year ago • 9 comments

To reproduce:

  1. Use a high-res image from https://replicate.com/xinntao/realesrgan
  2. Generate (screenshot: Knipsel2): slow

1arrcy1 avatar Jun 05 '23 12:06 1arrcy1

The inpainted mesh code is almost entirely CPU-bound and is not multithreaded.

thygate avatar Jun 05 '23 14:06 thygate

> the inpainted mesh code is almost entirely cpu bound and is not multi threaded.

Would using an NVIDIA GPU (1070) be better, since my AMD GPU doesn't work?

1arrcy1 avatar Jun 05 '23 14:06 1arrcy1

Not for generating the inpainted mesh, since it's almost completely CPU-bound.

thygate avatar Jun 05 '23 15:06 thygate

> not for generating the inpainted mesh since it's almost completely cpu bound.

Is there nothing I can do to optimize the workflow? It takes 4 hours for a 20 MB picture to be rendered. I'm happy with the result, but it takes a bit too long. Any advice you can give me?

1arrcy1 avatar Jun 07 '23 16:06 1arrcy1

That's why it says slooooow... 4 hrs is a long time, though. 20 MB: what dimensions are we talking about here?

Except for overclocking or getting a CPU with better single-core performance, I am not aware of anything you can do to speed it up, short of rewriting the code from the original repo, which was only slightly adapted for use in this extension.

thygate avatar Jun 07 '23 17:06 thygate

Not sure why it's using an efficiency core and not a performance core (P-core)...? I'm still on a 6th-gen Intel CPU, so I have no experience with this. Maybe you can set the affinity in Task Manager or something?

Is this a laptop?

thygate avatar Jun 07 '23 17:06 thygate
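The affinity suggestion above can also be tried from a script instead of Task Manager. The following is only a minimal sketch: `pin_to_cpus` and `ASSUMED_P_CORE_CPUS` are hypothetical names, the CPU numbering for the 13900K is an assumption that should be checked on the actual machine, and `os.sched_setaffinity` exists only on Linux, so on Windows the function does nothing and returns `None`.

```python
import os

# Assumption: on a 13900K the 8 P-cores usually show up as logical CPUs 0-15
# (two hardware threads each) and the E-cores as 16-31. Verify this locally.
ASSUMED_P_CORE_CPUS = set(range(16))


def pin_to_cpus(wanted):
    """Restrict the current process to the given logical CPUs, if the OS allows it.

    Returns the set of CPUs the process may run on afterwards, or None when
    the platform has no os.sched_setaffinity (e.g. Windows, macOS).
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    available = os.sched_getaffinity(0)  # 0 means "this process"
    target = set(wanted) & available
    if not target:
        # None of the wanted CPUs exist here; leave the affinity unchanged.
        return set(available)
    os.sched_setaffinity(0, target)
    return set(os.sched_getaffinity(0))
```

Calling `pin_to_cpus(ASSUMED_P_CORE_CPUS)` before starting the mesh generation would keep the single worker thread off the E-cores on Linux. On Windows, Task Manager's "Set affinity" or `start /affinity` with a hex mask in `cmd` serves the same purpose.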

It's 3264x5824 pixels, on a 13900K. I want to do the parallax on 50 pictures; that's 200 hours if the program doesn't crash, which it does :/ It would be ideal if it could use multiple cores; the first part is done within 5 minutes. Is there no optimization that can be done? As it stands, it's somewhat unusable for my use case.

1arrcy1 avatar Jun 07 '23 18:06 1arrcy1
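Since each mesh is generated on a single core, one way to picture the 200-hour figure, and what several cores could change for a batch, is to run one image per process. This is a rough sketch, not something the extension provides: `estimated_batch_hours`, `run_batch` and the `generate_mesh` argument are hypothetical, and the estimate assumes that per-image time does not get worse when several run at once and that there is enough RAM for all of them.

```python
import math
from concurrent.futures import ProcessPoolExecutor


def estimated_batch_hours(n_images, hours_per_image, workers=1):
    """Wall-clock estimate when `workers` images are processed at a time."""
    rounds = math.ceil(n_images / workers)
    return rounds * hours_per_image


def run_batch(image_paths, generate_mesh, workers):
    """Run a single-threaded mesh function on several images in parallel.

    `generate_mesh` is a placeholder for whatever produces one mesh; it must
    be a top-level (picklable) function that takes one image path.
    """
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(generate_mesh, image_paths))


print(estimated_batch_hours(50, 4))     # 200 hours, one image at a time
print(estimated_batch_hours(50, 4, 8))  # 28 hours with 8 images at a time
```

This does not make a single image any faster; it only overlaps independent images, so memory use grows with the number of workers.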

Like I said, not that I'm aware of. 4 hrs does seem excessive for that size; try getting it to run on a P-core for starters.

If this is a laptop, check the power profile.

thygate avatar Jun 07 '23 19:06 thygate

In this issue, it seems the poster figured out an easy workaround to speed things up: swap the numpy arrays for torch tensors. https://github.com/thygate/stable-diffusion-webui-depthmap-script/issues/121

bkutasi avatar Jun 22 '23 14:06 bkutasi