New issue with midas function section

Open GwendalC opened this issue 2 years ago • 9 comments

Hello, could you help me solve this issue? I restarted the notebook several times; it was working fine until 1 PM today. Could it be a dependency issue? Thanks! (You're awesome.)

    in
          1 #@title ### 1.4 Define Midas functions
          2
    ----> 3 from midas.dpt_depth import DPTDepthModel
          4 from midas.midas_net import MidasNet
          5 from midas.midas_net_custom import MidasNet_small

    2 frames

    /content/MiDaS/midas/backbones/next_vit.py in
          6 from .utils import activations, forward_default, get_activation
          7
    ----> 8 file = open("./externals/Next_ViT/classification/nextvit.py", "r")
          9 source_code = file.read().replace(" utils", " externals.Next_ViT.classification.utils")
         10 exec(source_code)

    FileNotFoundError: [Errno 2] No such file or directory: './externals/Next_ViT/classification/nextvit.py'
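
The traceback shows an unconditional `open()` on a relative path while the module is being imported. A relative path is resolved against the current working directory, so the import fails whenever the file is not there. Below is a minimal sketch of a guard that treats such a file as optional. It is an illustration only, not the actual MiDaS code or patch, and `load_optional_source` and `NEXT_VIT_AVAILABLE` are hypothetical names.

```python
from pathlib import Path


def load_optional_source(path):
    """Return the text of `path`, or None when the optional file is missing."""
    candidate = Path(path)
    if not candidate.is_file():
        return None
    return candidate.read_text()


# Hypothetical usage: only enable the backbone when its source file exists.
source_code = load_optional_source("./externals/Next_ViT/classification/nextvit.py")
NEXT_VIT_AVAILABLE = source_code is not None
```

With a guard like this, importing the package no longer depends on the optional backbone being checked out.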

GwendalC avatar Dec 24 '22 17:12 GwendalC

It seems there has been an update of midas today.

[Dec 2022] Released MiDaS v3.1:
- New models based on 5 different types of transformers (BEiT, Swin2, Swin, Next-ViT, LeViT)
- Training datasets extended from 10 to 12, including also KITTI and NYU Depth V2 using BTS split
- Best model, BEiTLarge 512, with resolution 512x512, is on average about 28% more accurate than MiDaS v3.0
- Integrated live depth estimation from camera feed

GwendalC avatar Dec 24 '22 18:12 GwendalC

I guess the fix should be something like pinning the gitclone call in disco.py to version 3.0:

    try:
        from midas.dpt_depth import DPTDepthModel
    except:
        if not os.path.exists('MiDaS'):
            gitclone("https://github.com/isl-org/MiDaS.git")
        if not os.path.exists('MiDaS/midas_utils.py'):
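
A rough sketch of what pinning the clone could look like. It assumes the pre-3.1 release is available under a tag named `v3` in the MiDaS repository; the tag name is an assumption and should be checked against the repository's tag list. `clone_command` and `clone_pinned` are hypothetical helpers, not the notebook's own `gitclone` function.

```python
import subprocess

MIDAS_REPO = "https://github.com/isl-org/MiDaS.git"


def clone_command(repo_url, ref=None, dest=None):
    """Build a `git clone` argument list, optionally pinned to a tag or branch."""
    cmd = ["git", "clone", "--depth", "1"]
    if ref is not None:
        # `--branch` accepts tag names as well as branch names.
        cmd += ["--branch", ref]
    cmd.append(repo_url)
    if dest is not None:
        cmd.append(dest)
    return cmd


def clone_pinned(repo_url, ref, dest=None):
    """Run the pinned clone; raises CalledProcessError if git fails."""
    subprocess.run(clone_command(repo_url, ref, dest), check=True)
```

Calling `clone_pinned(MIDAS_REPO, "v3", "MiDaS")` would then check out the assumed tag instead of whatever the default branch currently contains.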

GwendalC avatar Dec 24 '22 18:12 GwendalC

same here

jszgz avatar Dec 25 '22 13:12 jszgz

Cause here: https://github.com/isl-org/MiDaS/issues/193

GwendalC avatar Dec 25 '22 17:12 GwendalC

worst crisis of my life

xirtus avatar Dec 30 '22 09:12 xirtus

As long as the fix is not merged, the branch with the fix can be used: https://colab.research.google.com/github/StMoelter/disco-diffusion/blob/fix%2Fmidas-checkout-v3-tag/Disco_Diffusion.ipynb

StMoelter avatar Dec 31 '22 14:12 StMoelter

Hi guys. MiDaS v3.1 is now fixed so that NextViT, which was causing the issue, is optional. You can therefore use tag v3_1 and also use the latest models with even better performance. For instance, you could point to tag v3_1, download the checkpoint from the corresponding release, and then define, for example, BEiT_L_384 like so:

    if midas_model_type == "beit_l_384":  # BEiT_L_384
        midas_model = DPTDepthModel(
            path=midas_model_path,
            backbone="beitl16_384",
            non_negative=True,
        )
        net_w, net_h = 384, 384
        resize_mode = "minimal"
        normalization = NormalizeImage(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
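
For reference, a small sketch of what a mean of 0.5 and a standard deviation of 0.5 do to pixel values. It assumes the usual per-channel rule `(value - mean) / std` applied to channels already scaled to [0, 1]; `normalize_pixel` is a hypothetical stand-in, not the MiDaS `NormalizeImage` class.

```python
def normalize_pixel(pixel, mean=(0.5, 0.5, 0.5), std=(0.5, 0.5, 0.5)):
    """Apply (value - mean) / std to each channel of one RGB pixel."""
    return tuple((c - m) / s for c, m, s in zip(pixel, mean, std))


# Channel values 0.0, 0.5 and 1.0 map to -1.0, 0.0 and 1.0.
print(normalize_pixel((0.0, 0.5, 1.0)))
```

Under that assumption, these settings rescale inputs from [0, 1] to [-1, 1].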

thias15 avatar Jan 02 '23 14:01 thias15

@thias15 Thanks. It could be an interesting thing to try.

However, there's a funny thing about the depth estimation in Disco Diffusion. In many cases, more accurate depth estimation may result in aesthetically worse results.

When I initially tried using MiDaS dpt_large (from v3) alone, I found that in combination with the flow field technique used for the transformation to the next frame, the MiDaS dpt_large depth estimation was already too good. Sharply defined edges in common content expose undesirable properties of the simple flow field approach. (I also had an experimental, better technique prior to the DD v5 release. I didn't include it initially because it was complicated and I thought people wouldn't know why I'd done it. Then I lost the code and haven't prioritized redoing it.) I quickly improved the aesthetics by introducing a weighted blend with the AdaBins output. I suspect that with the flow field approach unchanged, most people would get better results by increasing the AdaBins contribution further.
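
The weighted blend can be sketched as a per-pixel linear mix of the two depth maps. This is a minimal illustration, assuming both maps have the same shape and a comparable scale. `blend_depth` and `midas_weight` are names chosen for this sketch and are not necessarily those used in Disco Diffusion.

```python
def blend_depth(midas_depth, adabins_depth, midas_weight):
    """Mix two depth maps (lists of rows) as w * midas + (1 - w) * adabins."""
    if not 0.0 <= midas_weight <= 1.0:
        raise ValueError("midas_weight must be between 0 and 1")
    adabins_weight = 1.0 - midas_weight
    return [
        [midas_weight * m + adabins_weight * a for m, a in zip(m_row, a_row)]
        for m_row, a_row in zip(midas_depth, adabins_depth)
    ]


# Equal weighting of two tiny 1x2 depth maps.
print(blend_depth([[1.0, 2.0]], [[3.0, 4.0]], 0.5))  # [[2.0, 3.0]]
```

In this sketch, lowering `midas_weight` is what increases the AdaBins contribution.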

aletts avatar Jan 02 '23 15:01 aletts

@aletts interesting! Note that we have introduced several new models in release v3.1 leveraging different backbones with various trade-offs between accuracy and speed, e.g. Swin-L, SwinV2-T, SwinV2-B, SwinV2-L, LeViT, BEiT-L, etc. Might be interesting to try different variants to see how nicely they play with the flow field approach. By the way, what exactly is the flow field approach used for and how does it work? On a side note, we will also release a new depth estimation model in the near future that essentially combines AdaBins and MiDaS, so stay tuned for that.

thias15 avatar Jan 04 '23 09:01 thias15