
Clarification question: Instance mapping appears to be missing

Open hoomanNo5 opened this issue 7 years ago • 15 comments

Hi,

I'm trying to understand how instance boundary maps are used by your code to improve the synthesized output of netG.

This excerpt is from section 3.3 of your paper and is very clear. I agree with it as well.

"Instead, we argue that the most important information the instance map provides, which is not available in the semantic label map, is the object boundary. For example, when a number of same-class objects are next to one another, looking at the semantic label map alone cannot tell them apart. This is especially true for the street scene since many parked cars or walking pedestrians are often next to one another, as shown in Fig. 3a. However, with the instance map, separating these objects becomes an easier task."

I was able to successfully run your code and synthesize output as expected. However, I am confused when I look deeper into the inputs provided to netG in the examples.

The instance boundary maps (found in ./datasets/cityscapes/test_inst) don't appear to provide boundary information. For example, frankfurt_000001_047552_gtFine_instanceIds.png (below) doesn't define boundaries of the vehicles parked on either side of the street.

[image: frankfurt_000001_047552_gtFine_instanceIds.png]

In other examples, boundary mapping does appear, but only for a small part of the image (frankfurt_000001_054640_gtFine_instanceIds.png). In the image below, the red box highlights an area with boundary mapping, but it is not consistent throughout the image.

[image: frankfurt_000001_054640_gtFine_instanceIds.png]

I used GIMP to inspect the hex color codes to make sure there was no tiny variation that my eyes could not detect. I used the same technique to inspect the label map, which contains both overt and subtle color distinctions.

Is this because that file is not an instance boundary map? If so, is it the concatenation of the one-hot representation of the semantic label map and the boundary map? If not, is it the channel-wise concatenation of the instance boundary map, the semantic label map, and the real image?

The files are named "_labelIds" and "_instanceIds", which is why I am confused.

Please help me clear up my confusion, because otherwise, wouldn't netG consider these groups of cars and people to be a single object during synthesis?

Thank you for sharing your hard work. I really am enjoying experimenting with it.

hoomanNo5 avatar Jan 16 '18 01:01 hoomanNo5

The instance IDs for different cars should be different. Please note that the IDs are greater than 256 (e.g. 26001, 26002, ...), so please keep that in mind when reading in the image.
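For example, a minimal sketch of reading the file without truncating the values, using Pillow (the filename is one of the test images mentioned above):

from PIL import Image
import numpy as np

# Pillow keeps the full bit depth of the PNG; numpy then shows the raw IDs
inst = np.array(Image.open('frankfurt_000001_047552_gtFine_instanceIds.png'))
print(inst.dtype, inst.max())  # e.g. car pixels appear as 26001, 26002, ...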

tcwang0509 avatar Jan 16 '18 18:01 tcwang0509

Thanks for responding. Can you tell me where in the image you are defining the IDs? I cannot locate any values that match your description.

hoomanNo5 avatar Jan 18 '18 21:01 hoomanNo5

The IDs are in *_instanceIds.png. The way Cityscapes encodes the instance IDs is as follows: for classes that have instances labeled (e.g. cars and pedestrians), the instance ID divided by 1000 is the class ID, while the remainder when divided by 1000 is the ID of the particular instance. For example, 26003 means it's a car (label 26) with instance ID 003. For classes that don't have instances labeled (e.g. trees), the instance ID is just the same as the label ID.
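In code, that decoding is just integer division and modulo; a minimal sketch (my own helper, not from the repo):

def decode_instance_id(iid):
    # classes with instances: iid = class_id * 1000 + instance index
    if iid >= 1000:
        return iid // 1000, iid % 1000
    # classes without instances: the value is just the label ID
    return iid, None

print(decode_instance_id(26003))  # (26, 3): the third car in the image
print(decode_instance_id(21))     # (21, None): e.g. vegetation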

tcwang0509 avatar Jan 20 '18 01:01 tcwang0509

I appreciate the feedback, but I don't see any 4- or 5-digit label info embedded in the PNGs anywhere. I am not able to get the Cityscapes dataset for comparison, so I'm only looking at the images included in the /datasets/cityscapes/*inst folders. They look like the images I posted above. Those colors don't match your example format. GIMP tells me they are hex 656565, pixel value 101.


I don't know where in your code or what tool you use to inspect and find those instance IDs, but I'm pretty sure I'm not seeing what you're seeing.

I tried using my own images to test the netG checkpoint, but I can't segment and label my images according to how your code expects them. Perhaps I could build a segnet into the front of your code so that future users don't have to worry about this problem and can just provide raw images for testing and training?

hoomanNo5 avatar Jan 26 '18 23:01 hoomanNo5

I am trying this with my own dataset and am also finding the required prep and encoding unclear. A segnet in front would be absolutely fabulous!

codeisnotcode avatar Jan 29 '18 07:01 codeisnotcode

Hi @tcwang0509, thanks for all the hard work, very impressive. I think the confusion is created by the mismatched data between the paper and the GitHub example. What format should we use? Thanks

aviel08 avatar May 26 '18 11:05 aviel08

@codeisnotcode do you have instance maps for your own dataset? If not, you can just specify '--no_instance'. @aviel08 the colorful label maps are just for visualization. When feeding the network, you need images similar to the files on GitHub.
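If it helps, a hypothetical training invocation without instance maps might look like this (my_dataset is a placeholder; check options/base_options.py in the repo for the exact flags):

python train.py --name my_dataset --dataroot ./datasets/my_dataset --no_instance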

tcwang0509 avatar Jun 28 '18 21:06 tcwang0509

@hoomanNo5 The Cityscapes instanceIds images are encoded by instance ID. In short, different classes of objects have different pixel values, and different instances of the same class also have different pixel values. Therefore, following the principle described in the paper, you can write your own code to generate the boundary map. I gave it a quick try; the result is below:

[image: aachen_000000_000019_gtFine_instanceIds.png]

[screenshot: the pixel value range]

[image: boundary map]

ZzzjzzZ avatar Jul 01 '18 02:07 ZzzjzzZ

@CarryJzzZ Hi, your boundary map looks very good! Can I ask how you did this, and which algorithm you used? I ask because I tried many edge detection algorithms and couldn't achieve this effect. I would be very thankful if you could share it!

594cp avatar Jul 24 '18 13:07 594cp

@CarryJzzZ Thank you for making it clear. Just one question: how do you read the instance map? I read it with OpenCV as usual and my maximum value is 127. I believe I am not reading it correctly, so could you please share how you got the max pixel value of 33001?

doantientai avatar Sep 14 '18 09:09 doantientai

Finally I found it! @hoomanNo5 You just need to read the image using OpenCV with the option cv2.IMREAD_UNCHANGED or cv2.IMREAD_ANYDEPTH, like this:

inst = cv2.imread(join(path_inst, inst_name), cv2.IMREAD_ANYDEPTH)

And you will see the true values of the pixels. Otherwise OpenCV converts the values to 8-bit integers in [0, 255], and obviously all the cars end up with the same pixel value.
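A quick way to see the difference (inst_path being whatever instanceIds file you are testing):

import cv2

inst_8bit = cv2.imread(inst_path)                         # default: values squashed to 8-bit
inst_16bit = cv2.imread(inst_path, cv2.IMREAD_ANYDEPTH)   # keeps the true 16-bit IDs
print(inst_8bit.max(), inst_16bit.max())                  # e.g. 127 vs 33001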

I spent my whole week trying to generate the instance map and couldn't understand why it didn't work. It turns out that I hadn't correctly understood how the instance map was created.

doantientai avatar Sep 14 '18 10:09 doantientai

@594cp @doantientai I like to use Pillow for image processing. For this issue, I just followed the method used in the paper. It's simple and may have some bugs; feel free to debug it and find a better way.

import os

import numpy as np
from PIL import Image

RAW_INPUT_PATH = './datasets/cityscapes/'  # adjust to your dataset root

def boundary(raw_input, save_path, save_name):
    """
    Calculate the boundary mask and save it.
    :param raw_input: path to a *_instanceIds.png image
    :param save_path: city name (subfolder under RAW_INPUT_PATH)
    :param save_name: filename for the saved boundary mask
    """
    # read the instance mask; Pillow preserves the 16-bit instance IDs
    instance_mask = Image.open(raw_input)
    width = instance_mask.size[0]
    height = instance_mask.size[1]
    mask_array = np.array(instance_mask)

    # define the boundary mask
    boundary_mask = np.zeros((height, width), dtype=np.uint8)  # values 0-255

    # boundary test: the center pixel's ID differs from any of its 4-nearest neighbors
    for i in range(1, height - 1):
        for j in range(1, width - 1):
            if mask_array[i, j] != mask_array[i - 1, j] \
                    or mask_array[i, j] != mask_array[i + 1, j] \
                    or mask_array[i, j] != mask_array[i, j - 1] \
                    or mask_array[i, j] != mask_array[i, j + 1]:
                boundary_mask[i, j] = 255
    boundary_image = Image.fromarray(boundary_mask)
    # boundary_image.show()
    boundary_image.save(os.path.join(RAW_INPUT_PATH, save_path, save_name))
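
If the nested Python loop is too slow on full-resolution Cityscapes images, the same 4-neighbor comparison can be vectorized with NumPy slicing (a sketch, equivalent in output to the loop above):

def boundary_fast(mask_array):
    # same rule as above: a pixel is boundary if it differs from any 4-neighbor
    center = mask_array[1:-1, 1:-1]
    edge = np.zeros(mask_array.shape, dtype=np.uint8)
    edge[1:-1, 1:-1] = np.where((center != mask_array[:-2, 1:-1]) |
                                (center != mask_array[2:, 1:-1]) |
                                (center != mask_array[1:-1, :-2]) |
                                (center != mask_array[1:-1, 2:]), 255, 0)
    return edge

Note that, if I read the repo correctly, pix2pixHD also computes this edge map on the fly from the instance map (get_edges in models/pix2pixHD_model.py), so precomputing boundary files is mostly useful for inspection.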

ZzzjzzZ avatar Sep 14 '18 10:09 ZzzjzzZ

@CarryJzzZ Thank you, that is really helpful!

doantientai avatar Sep 14 '18 10:09 doantientai

Hi, I am wondering how I can get the instanceIds images for my own dataset.

zhangdan8962 avatar Jan 17 '20 19:01 zhangdan8962

@zhangdan8962 have you found the answer? I have my own dataset and I created one-channel instance maps, but it gives me an error. What is the expected shape of the instance maps? The same as the input?

Rubiel1 avatar Mar 21 '20 15:03 Rubiel1