PAIR-Diffusion
How to use a customized segmentation mask at test time?
I want to replace the OneFormer model used in your demo with another segmentation model (for instance, SAM). Now suppose I have:

- `input_img`: H1 * W1 * 3, uint8
- `input_mask`: H1 * W1, binary {0, 1}
- `ref_img`: H2 * W2 * 3, uint8
- `ref_mask`: H2 * W2, binary {0, 1}

How should I modify the demo code to get the correct structure and appearance guidance features? Thanks for your help in advance!
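For concreteness, a minimal sketch of the inputs I mean (sizes and values are hypothetical, names as above):

```python
import numpy as np

# Hypothetical sizes; my real images differ.
H1, W1 = 512, 512
H2, W2 = 480, 640

input_img = np.zeros((H1, W1, 3), dtype=np.uint8)   # H1 * W1 * 3, uint8
input_mask = np.zeros((H1, W1), dtype=np.uint8)     # H1 * W1, binary {0, 1}
ref_img = np.zeros((H2, W2, 3), dtype=np.uint8)     # H2 * W2 * 3, uint8
ref_mask = np.zeros((H2, W2), dtype=np.uint8)       # H2 * W2, binary {0, 1}
```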
Also, what's the expected dtype, shape, and data range for `img` and `mask` here? I suppose `img` should be `torch.float32` in `[-1, 1]`, but what about the mask input? Can I directly use a binary 0/1 mask?
Yes, the image should be `torch.float32` in `[-1, 1]`, and the mask should be of integer type (`.int()`), so it can be binary as well.
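A minimal conversion sketch under those constraints (not the repo's exact code; the batch/channel layout here is my assumption):

```python
import numpy as np
import torch

def to_model_inputs(img_u8: np.ndarray, mask_01: np.ndarray):
    """uint8 H*W*3 image and binary H*W mask -> tensors with the stated dtype/range."""
    # [0, 255] uint8 -> float32 in [-1, 1]; the NCHW layout is an assumption.
    img = torch.from_numpy(img_u8).float().div(127.5).sub(1.0)
    img = img.permute(2, 0, 1).unsqueeze(0)
    # Integer mask; a binary {0, 1} mask is fine per the answer above.
    mask = torch.from_numpy(mask_01).int().unsqueeze(0)
    return img, mask
```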
@Lotayou btw, if I understand correctly, to run this with a different segmentation model you will need not only a binary mask but also the panoptic and semantic segmentation masks. You pass to `get_appearance` {the input image and the panoptic mask (which isn't binary)} and {the reference image and the reference mask (which is binary)}, but please correct me @vidit98 if I'm wrong.
Yes, that is correct. If a different segmentation model is used, its class mapping should match that of the trained model; otherwise a new model has to be trained. The pipeline @CesarERamosMedina is describing is for the demo; however, the picture above appears to be from the model definition.
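If the class ids of a new segmenter don't line up with the trained model's, a hypothetical remapping might look like this (the id table is invented; the real one depends on the trained checkpoint):

```python
import numpy as np

# Invented mapping: new segmenter's class id -> trained model's class id.
NEW_TO_TRAINED = {0: 0, 7: 3, 12: 3, 21: 9}

def remap_classes(seg: np.ndarray, num_new_classes: int = 256) -> np.ndarray:
    """Remap an H*W map of class ids via a lookup table; unmapped ids fall back to 0."""
    lut = np.zeros(num_new_classes, dtype=seg.dtype)
    for new_id, trained_id in NEW_TO_TRAINED.items():
        lut[new_id] = trained_id
    return lut[seg]
```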
I actually meant from the model itself. I think the code in your Gradio app makes two calls to that function, one with a binary mask and one with a non-binary mask.
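Roughly this pattern, i.e. a stubbed sketch (`get_appearance`'s real signature lives in the repo; everything else here is a stand-in):

```python
import torch

class StubModel:
    # Stand-in for the real model; the actual method extracts per-region
    # appearance features from an image/mask pair.
    def get_appearance(self, img: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
        return torch.zeros(1)

model = StubModel()
input_img = torch.zeros(1, 3, 512, 512)          # float32 in [-1, 1]
panoptic_mask = torch.zeros(1, 512, 512).int()   # non-binary panoptic ids
ref_img = torch.zeros(1, 3, 512, 512)
ref_mask = torch.zeros(1, 512, 512).int()        # binary {0, 1}

feats_input = model.get_appearance(input_img, panoptic_mask)  # call 1: panoptic mask
feats_ref = model.get_appearance(ref_img, ref_mask)           # call 2: binary mask
```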