ViTMatte

Softmax outputs?

Open rb-synth opened this issue 2 years ago • 7 comments

I have an image with multiple objects and a background. Is there any way to produce mattes such that the sum at any given pixel is equal to one? In other words, to consider the objects at the same time rather than individually? When they are considered independently, I sometimes end up with a blank region between two touching objects, which gives the impression that there is background between the two objects even though I know this is not the case.

Any ideas what to do here?

rb-synth avatar Oct 20 '23 10:10 rb-synth

For example, I take an image, get masks (with SAM) and get mattes. Then I visualise where alpha_1 + alpha_2 != 1:

Image: cats-and-dogs

Masks: mask_18, mask_27

Mattes: alpha_18, alpha_27

Areas where sum != 1: diff
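
For reference, a check like this can be sketched in a few lines of NumPy. This is only an illustration, not the exact script used for the images above: the function name `sum_violation_map`, the tolerance `tol`, and the optional `region` mask are assumptions, and the toy arrays stand in for the real alpha_18 and alpha_27 mattes.

```python
import numpy as np

def sum_violation_map(alphas, region=None, tol=1e-3):
    """Flag pixels where the per-object mattes do not sum to 1.

    alphas: array of shape (K, H, W) with values in [0, 1], one matte per object.
    region: optional boolean (H, W) mask; if given, only pixels inside it are
            checked (e.g. the area known to contain no background).
    tol:    hypothetical tolerance for floating-point and rounding noise.
    """
    alphas = np.asarray(alphas, dtype=np.float64)
    total = alphas.sum(axis=0)            # per-pixel sum over all objects
    bad = np.abs(total - 1.0) > tol       # True where the sum deviates from 1
    if region is not None:
        bad &= np.asarray(region, dtype=bool)
    return bad

# Toy 1x4 "image": the third pixel is a gap between two touching objects.
alpha_18 = np.array([[1.0, 0.7, 0.2, 0.0]])
alpha_27 = np.array([[0.0, 0.3, 0.5, 1.0]])
diff = sum_violation_map(np.stack([alpha_18, alpha_27]))
```

Without a `region` mask, genuine background pixels (where every matte is 0) are also flagged, so restricting the check to the union of the object masks is probably what you want in practice.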

rb-synth avatar Oct 20 '23 11:10 rb-synth

In your case, I would try generating a single trimap for both of them instead of two separate ones. You can do this easily with our matte-anything.

JingfengYao avatar Nov 14 '23 19:11 JingfengYao

Hi, matte-anything appears to be segment anything, followed by ViTMatte, so how would that be different from this example? All the matte-anything examples give binary masks, but I need multi-instance matting – is this possible with matte-anything?

rb-synth avatar Nov 15 '23 16:11 rb-synth

To be clear, in this toy example I couldn't create just one trimap since I have three classes – dog, cat, background.

rb-synth avatar Nov 15 '23 16:11 rb-synth

Do you mean something like this? [images: 1700067377392, 1700067439937]

JingfengYao avatar Nov 15 '23 16:11 JingfengYao

No, this is still binary. It's either:

  1. FG: cat A, BG: everything else,
  2. FG: cats A and B, BG: everything else, or
  3. FG: cat B, BG: everything else.

I would want it to matte both of the cats independently, but in such a way that at the border between the two cats the sum of the mattes == 1.
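
To make the constraint concrete, here is a rough post-processing sketch of what "sum of the mattes == 1" would mean. It is not something ViTMatte or matte-anything provides, as far as this thread shows. It assumes the independent mattes are stacked into a (K, H, W) array and that a boolean mask of pixels known to contain no background is available (for example, from the eroded union of the SAM masks). The names `normalise_mattes`, `no_background`, and `eps` are made up for illustration.

```python
import numpy as np

def normalise_mattes(alphas, no_background, eps=1e-6):
    """Rescale independent mattes so they sum to 1 where no background exists.

    alphas:        (K, H, W) array of independently predicted mattes in [0, 1].
    no_background: boolean (H, W) mask of pixels assumed to be fully covered
                   by the K objects (an assumption supplied by the user).
    eps:           pixels whose total alpha is below this are left untouched,
                   since there is nothing to redistribute there.
    """
    alphas = np.asarray(alphas, dtype=np.float64)
    no_background = np.asarray(no_background, dtype=bool)
    total = alphas.sum(axis=0)
    fix = no_background & (total > eps)   # pixels to renormalise
    scale = np.ones_like(total)
    scale[fix] = 1.0 / total[fix]
    return alphas * scale                 # (H, W) broadcasts over (K, H, W)

# Toy 1x4 example: two touching cats, last pixel is real background.
alpha_a = np.array([[1.0, 0.6, 0.2, 0.0]])
alpha_b = np.array([[0.0, 0.2, 0.3, 0.0]])
no_background = np.array([[True, True, True, False]])
joint = normalise_mattes(np.stack([alpha_a, alpha_b]), no_background)
```

This only redistributes the existing predictions proportionally. It enforces the sum constraint but says nothing about whether the resulting split between the two objects is the right one.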

rb-synth avatar Nov 16 '23 07:11 rb-synth

I see. Interesting perspective. However, this seems difficult for matting models like ViTMatte, since the training framework is different. From my own perspective, it is also difficult to say which alpha value (for example, 0.5 or 0.6) is absolutely correct at the edges of an object.

JingfengYao avatar Nov 18 '23 14:11 JingfengYao