Marigold - monocular depth estimation pipeline

Open toshas opened this issue 1 year ago • 14 comments

Model/Pipeline/Scheduler description

Marigold depth is the current diffusion-based state-of-the-art monocular depth estimation pipeline (image-to-depth). It is derived from Stable Diffusion and fine-tuned with synthetic data. Marigold can zero-shot transfer to unseen data, offering usable results under the most challenging conditions in the wild.

This issue is the place to discuss the pipeline's status and future development. The pipeline was recently integrated into diffusers as a community pipeline (https://github.com/huggingface/diffusers/pull/6249).
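For illustration, a minimal sketch of how a community pipeline of this kind is typically loaded through the `custom_pipeline` argument of `DiffusionPipeline.from_pretrained`. The model ID is the one listed in this issue; the `custom_pipeline` value assumes the community file name `marigold_depth_estimation` mentioned in this thread, and the helper name `load_marigold_depth_pipeline` is hypothetical.

```python
# Assumed identifiers: the model ID comes from this issue; the custom pipeline
# name is assumed to match the community file name mentioned in the thread.
MODEL_ID = "Bingxin/Marigold"
CUSTOM_PIPELINE = "marigold_depth_estimation"


def load_marigold_depth_pipeline(model_id=MODEL_ID, custom_pipeline=CUSTOM_PIPELINE):
    """Hypothetical helper: load the community depth pipeline by name."""
    # Imported lazily so this snippet can be read and imported without
    # diffusers installed; calling the helper requires diffusers and
    # network access to download the weights.
    from diffusers import DiffusionPipeline

    return DiffusionPipeline.from_pretrained(model_id, custom_pipeline=custom_pipeline)
```

Calling `load_marigold_depth_pipeline()` downloads the weights on first use; the arguments accepted by the returned pipeline are defined by the community file itself and are not shown here.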

Website Paper Hugging Face Space Hugging Face Model

Open source status

  • [X] The model implementation is available.
  • [X] The model weights are available (Only relevant if addition is not a scheduler).

Provide useful links for the implementation

Authors: @markkua @toshas @ShengyuH @nandometzger @rcdaudt @prs-eth
Model weights: https://huggingface.co/Bingxin/Marigold

toshas avatar Jan 11 '24 16:01 toshas

I see this as extending diffusers in the form of a pipeline to support new tasks other than classical image generation. For example, in this case, we have "depth estimation", which is a critical computer vision problem with lots of real-world applicability.

The https://github.com/prs-eth/Marigold repository also has more than 1k stars, which is promising.

sayakpaul avatar Jan 12 '24 01:01 sayakpaul

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

github-actions[bot] avatar Feb 11 '24 15:02 github-actions[bot]

@yiyixuxu curious to know your thoughts.

sayakpaul avatar Feb 11 '24 15:02 sayakpaul

Interesting! Community pipeline first?

yiyixuxu avatar Feb 12 '24 07:02 yiyixuxu

We already have it under community :-)

sayakpaul avatar Feb 12 '24 07:02 sayakpaul

let's keep an eye on it then!

yiyixuxu avatar Feb 12 '24 07:02 yiyixuxu

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

github-actions[bot] avatar Mar 27 '24 15:03 github-actions[bot]

We are brewing a pull request to integrate our new Marigold-LCM model and possibly a pipeline. Additional modalities (normals) are also in the making.

toshas avatar Mar 27 '24 15:03 toshas

A few questions regarding planning the pipeline's growth and lifecycle.

1

The new Marigold-LCM pipeline will use the same code as the original but with a different set of defaults and expected ranges of values. We are thinking of adding a new file, called marigold_lcm_depth_estimation.py, and in it subclassing a new MarigoldLcmDepthPipeline class from the MarigoldPipeline class currently located in the marigold_depth_estimation.py file. In this new class, we would then override the __call__ method with new default values for the kwargs, and add extra asserts on the values actually passed. Does this seem like a good solution, or are we better off making the pipelines independent of each other?
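A minimal sketch of the subclassing approach described above. The base class here is a stand-in, not the real pipeline; the parameter names (`denoising_steps`, `ensemble_size`), the default values, and the allowed range are illustrative assumptions. The sketch raises `ValueError` where the comment mentions asserts, since assertions can be disabled at runtime.

```python
class MarigoldPipeline:
    """Stand-in for the existing depth pipeline (hypothetical signature)."""

    def __call__(self, image, denoising_steps=10, ensemble_size=10):
        # The real pipeline would run diffusion here; the stand-in just
        # echoes the arguments so the effect of the defaults is visible.
        return {
            "image": image,
            "denoising_steps": denoising_steps,
            "ensemble_size": ensemble_size,
        }


class MarigoldLcmDepthPipeline(MarigoldPipeline):
    """Same code path as the parent, with different defaults and range checks."""

    # Assumed upper bound for the LCM variant; illustrative only.
    MAX_DENOISING_STEPS = 4

    def __call__(self, image, denoising_steps=4, ensemble_size=5):
        # Validate the values actually passed before delegating to the parent.
        if not 1 <= denoising_steps <= self.MAX_DENOISING_STEPS:
            raise ValueError(
                f"denoising_steps must be in [1, {self.MAX_DENOISING_STEPS}], "
                f"got {denoising_steps}"
            )
        if ensemble_size < 1:
            raise ValueError(f"ensemble_size must be >= 1, got {ensemble_size}")
        return super().__call__(
            image, denoising_steps=denoising_steps, ensemble_size=ensemble_size
        )
```

The design keeps a single implementation of the inference logic; the trade-off is that the LCM file then depends on the original file being importable.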

2

I also noticed that we have made a couple of bad choices when initially committing the Marigold depth pipeline:

  • we called the depth pipeline MarigoldPipeline. It should have been MarigoldDepthPipeline.
  • we called the file with the depth pipeline marigold_depth_estimation.py, which is reflected in how the pipeline is instantiated using the from_pretrained method. We should have called the file marigold_depth_pipeline.py instead.

I want to perform these renaming actions (while also keeping in mind the LCM pipeline) to make it nice and extensible for the normals and other modalities pipelines. Would you recommend just renaming them all? Is it possible to maintain backward compatibility?
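One common way to soften a class rename is a deprecated alias, sketched below under stated assumptions. Both classes are stand-ins rather than the real pipeline code, and the sketch covers only the class name. It does not address the file name, which is what the loader uses to locate a community pipeline.

```python
import warnings


class MarigoldDepthPipeline:
    """Stand-in for the pipeline under its proposed new name."""

    def __init__(self, *args, **kwargs):
        self.args = args
        self.kwargs = kwargs


class MarigoldPipeline(MarigoldDepthPipeline):
    """Deprecated alias so code using the old class name keeps working."""

    def __init__(self, *args, **kwargs):
        # Warn once per call site, then behave exactly like the new class.
        warnings.warn(
            "MarigoldPipeline is deprecated; use MarigoldDepthPipeline instead.",
            FutureWarning,
            stacklevel=2,
        )
        super().__init__(*args, **kwargs)
```

Code that instantiates `MarigoldPipeline` still gets a working object, along with a `FutureWarning` pointing at the new name.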

toshas avatar Mar 28 '24 17:03 toshas

Renaming will be a little bit tricky because it will not be backwards-compatible. I think it's okay as is because the first version of Marigold deals with depth anyway.

Your plan with 1 sounds rock-solid to me! Keep us posted.

sayakpaul avatar Mar 29 '24 02:03 sayakpaul

My two cents: Marigold should be added to the core now. I like it a lot, and with LCM it should be fast.

The model has almost 40k downloads on Hugging Face, so it's very popular, and it can be used right out of the box with diffusers, whereas the other depth map preprocessors require installing controlnet_aux and sometimes cause problems because they're not part of this repo.

Personally, I use it a lot, especially with diff-diff, which benefits the most. A ControlNet trained with Marigold would probably be a lot better than the current one, but Marigold also works with the existing ones.

@toshas I have one suggestion: could you please upload fp16 variants of the models so people don't have to download the full-precision ones if they don't use them?

Really nice work!

asomoza avatar Mar 29 '24 12:03 asomoza

Yeah agreed now. It's time for a little graduation.

@yiyixuxu let's open an issue for this so that the community can pick it up!

sayakpaul avatar Mar 29 '24 12:03 sayakpaul

Yes, I created an issue here: https://github.com/huggingface/diffusers/issues/7522

yiyixuxu avatar Mar 29 '24 22:03 yiyixuxu

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

github-actions[bot] avatar Apr 23 '24 15:04 github-actions[bot]