kaggle-dsbowl-2018-dataset-fixes

Masks with more than one layer

Open ebouteillon opened this issue 6 years ago • 7 comments

Hi,

Thanks for sharing your corrections to the Kaggle images. However, I found that some masks have more than one layer (channel).

Here is the list:

Warning: ./kaggle-dsbowl-2018-dataset-fixes/stage1_train/08151b19806eebd58e5acec7e138dbfbb1761f41a1ab9620466584ecc7d5fada/masks/48a7e471b49e92e4a9b94f949e36359513cb8002dcd6f3fccba9ecc8e0b51613.png has shape (360, 360, 4)
Warning: ./kaggle-dsbowl-2018-dataset-fixes/stage1_train/00071198d059ba7f5914a526d124d28e6d010c92466da21d4a04cd5413362552/masks/af4d6ff17fa7b41de146402e12b3bab1f1fe3c1e6f37da81a54e002168b1e7dd.png has shape (256, 256, 4)
Warning: ./kaggle-dsbowl-2018-dataset-fixes/stage1_train/e49fc2b4f1f39d481a6525225ab3f688be5c87f56884456ad54c953315efae83/masks/a5467a779ae969d13a11c96edf1d79fcb3557bf874735361b093fd8381eed9cd.png has shape (256, 320, 4)
Warning: ./kaggle-dsbowl-2018-dataset-fixes/stage1_train/7f34dfccd1bc2e2466ee3d6f74ff05821a0e5404e9cf2c9568da26b59f7afda5/masks/c2e874308f92924df63fdf62fa335bf32c4ba67d7348e2aa69bb81434ee4bfeb.png has shape (256, 320, 4)
Warning: ./kaggle-dsbowl-2018-dataset-fixes/stage1_train/3a22fe593d9606d4f137461dd6802fd3918f9fbf36f4a65292be69670365e2ca/masks/5f8e4cae79eabe8d6a1775ceda8a70ff75d6543b3164ef9013b2d7abf4c39731-2.png has shape (360, 360, 4)
Warning: ./kaggle-dsbowl-2018-dataset-fixes/stage1_train/3a22fe593d9606d4f137461dd6802fd3918f9fbf36f4a65292be69670365e2ca/masks/5f8e4cae79eabe8d6a1775ceda8a70ff75d6543b3164ef9013b2d7abf4c39731.png has shape (360, 360, 4)
Warning: ./kaggle-dsbowl-2018-dataset-fixes/stage1_train/bc115ff727e997a88f7cfe4ce817745731a6c753cb9fab6a36e7e66b415a1d3d/masks/91c1e7ee69bb7b59fa6e995d5bd38f380e4bc4153ad120acfcc38460be68ac48.png has shape (360, 360, 4)
Warning: ./kaggle-dsbowl-2018-dataset-fixes/stage1_train/b1eb0123fe2d8c825694b193efb7b923d95effac9558ee4eaf3116374c2c94fe/masks/0f834446a35541208d20b4c26cc19dbb4af2176611837e4fb8570c7e94353291.png has shape (360, 360, 4)
Warning: ./kaggle-dsbowl-2018-dataset-fixes/stage1_train/b1eb0123fe2d8c825694b193efb7b923d95effac9558ee4eaf3116374c2c94fe/masks/9790898c3892ac0b92a08b9f878f344333023374b7464ee571b0010b98dacc51.png has shape (360, 360, 4)
Warning: ./kaggle-dsbowl-2018-dataset-fixes/stage1_train/b1eb0123fe2d8c825694b193efb7b923d95effac9558ee4eaf3116374c2c94fe/masks/337f2de889525efaf003b6d2422c25df28d83903c1ed34fb56f7a10f6e442517.png has shape (360, 360, 4)
Warning: ./kaggle-dsbowl-2018-dataset-fixes/stage1_train/b1eb0123fe2d8c825694b193efb7b923d95effac9558ee4eaf3116374c2c94fe/masks/78a94c505bcf4589ae5485ec1857880bca42db85b6ff8c8d67832fa45f3fbaa4.png has shape (360, 360, 4)
Warning: ./kaggle-dsbowl-2018-dataset-fixes/stage1_train/a0afead3b4fe393f6a6159de040ecb2e66f8a89090abf0d0bf5b8e1d38ae667c/masks/52a0da01e7292a55903c626bad32cb224d74013aed8dcee98b2b5c2ff0d8adc0-2.png has shape (360, 360, 4)
Warning: ./kaggle-dsbowl-2018-dataset-fixes/stage1_train/a0afead3b4fe393f6a6159de040ecb2e66f8a89090abf0d0bf5b8e1d38ae667c/masks/52a0da01e7292a55903c626bad32cb224d74013aed8dcee98b2b5c2ff0d8adc0.png has shape (360, 360, 4)
Warning: ./kaggle-dsbowl-2018-dataset-fixes/stage1_train/08275a5b1c2dfcd739e8c4888a5ee2d29f83eccfa75185404ced1dc0866ea992/masks/e412f045419f9fccbf9678a5970c3f2badc0079167b0b7eb02a3fce5dad82db9.png has shape (1024, 1024, 4)
Warning: ./kaggle-dsbowl-2018-dataset-fixes/stage1_train/0121d6759c5adb290c8e828fc882f37dfaf3663ec885c663859948c154a443ed/masks/e84dfd6c501746b19fe394531946dfe336204d104ec493a2114e00b294dc6b6b.png has shape (256, 320, 4)
Warning: ./kaggle-dsbowl-2018-dataset-fixes/stage1_train/0121d6759c5adb290c8e828fc882f37dfaf3663ec885c663859948c154a443ed/masks/5cc96f8fbbe571aa3e1d62a577d61b1397d2226d961058554ab29e8a473397c2.png has shape (256, 320, 4)
Warning: ./kaggle-dsbowl-2018-dataset-fixes/stage1_train/a0de55384fada5cbc46bd7a41f6feeef93b67d088497c7316079ccec39c2a834/masks/0cc0c8466c81b6e2a26d2779ba8be57616a2335dc879220e0eb7b7870588d78a.png has shape (256, 256, 4)
Warning: ./kaggle-dsbowl-2018-dataset-fixes/stage1_train/648636ee314d7bdba3ab2fc0fe49a863de35c3e2caf619039f678df67b526868/masks/61d68aff301c235ee2c0c706b1ad198130114e9a499d279d47ffe947cd8ff240.png has shape (256, 256, 4)
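
A list like the one above could also be produced without decoding any pixels, by reading the color type from each PNG header. This is only a minimal sketch using the Python standard library: `png_channels` and `find_multichannel_masks` are hypothetical names, and the `*/masks/*.png` layout is assumed from the paths in the warnings. It reports channels as stored in the file, which may differ from what an image reader returns (for example, for palette images).

```python
import pathlib
import struct

# PNG color type -> number of stored channels (per the PNG specification).
PNG_CHANNELS = {0: 1, 2: 3, 3: 1, 4: 2, 6: 4}
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_channels(header: bytes) -> int:
    """Return the channel count encoded in the first 26 bytes of a PNG file."""
    if len(header) < 26 or header[:8] != PNG_SIGNATURE or header[12:16] != b"IHDR":
        raise ValueError("not a PNG header")
    # IHDR payload starts at byte 16: width (4), height (4), bit depth (1), color type (1).
    _width, _height, _bit_depth, color_type = struct.unpack(">IIBB", header[16:26])
    return PNG_CHANNELS[color_type]


def find_multichannel_masks(root):
    """Yield (path, channels) for each mask PNG under root that stores more than one channel."""
    # Assumed layout: <root>/<image_id>/masks/<mask_id>.png
    for path in sorted(pathlib.Path(root).glob("*/masks/*.png")):
        with open(path, "rb") as fh:
            channels = png_channels(fh.read(26))
        if channels > 1:
            yield path, channels
```

Calling `list(find_multichannel_masks("stage1_train"))` would then give the files to inspect; the directory name here is taken from the warnings, not verified against the repository.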

ebouteillon avatar Mar 15 '18 21:03 ebouteillon

Hi @ebouteillon, you can take the first layer of the mask (mask[:,:,0]). Layers 0, 1, and 2 all have the mask pixels set to the maximum value.
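
Before discarding the other layers, it may be worth confirming that they really duplicate layer 0. A minimal sketch, assuming the mask is already loaded as an (H, W, C) NumPy array; `color_channels_agree` is a hypothetical helper name, and treating the last channel of 2- or 4-channel images as alpha is an assumption.

```python
import numpy as np


def color_channels_agree(mask: np.ndarray) -> bool:
    """Return True if every color channel of an (H, W, C) mask equals channel 0."""
    if mask.ndim != 3:
        raise ValueError("expected an (H, W, C) array")
    n_channels = mask.shape[2]
    # Assumption: with 2 or 4 channels, the last one is alpha and is ignored.
    n_color = n_channels - 1 if n_channels in (2, 4) else n_channels
    first = mask[..., 0]
    return all(np.array_equal(first, mask[..., c]) for c in range(1, n_color))
```

If this returns False for a file, taking `mask[:, :, 0]` alone could lose information, and that mask probably deserves a manual look.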

therahulkumar avatar Mar 18 '18 08:03 therahulkumar

Hi @princerk, that is already what I am doing ;) This is just to point out that these masks are inconsistent with the others (and with the originals, by the way). Regards

ebouteillon avatar Mar 18 '18 09:03 ebouteillon

@ebouteillon: How did you solve it? I am running code to classify the dataset into 2 clusters, but I get an error about mismatched mask dimensions for the IDs listed above. This is what I am doing:


import numpy as np
import skimage.io

# read_image and STAGE1_TRAIN_MASK_PATTERN are defined elsewhere (not shown).

# Get the image width and height, and count the available masks.
def read_image_labels(image_id, space="rgb"):
    image = read_image(image_id, space=space)
    mask_file = STAGE1_TRAIN_MASK_PATTERN.format(image_id)
    print(mask_file)

    masks = skimage.io.imread_collection(mask_file).concatenate()
    height, width, _ = image.shape
    num_masks = masks.shape[0]
    # Merge all masks into a single binary label image.
    labels = np.zeros((height, width), np.uint16)
    for index in range(0, num_masks):
        labels[masks[index] > 0] = 255  # index + 1
    return image, labels, num_masks

John1231983 avatar Mar 18 '18 13:03 John1231983

Instead of using masks[index] directly, I'm using an intermediate object. Let's call it m, then it would give you something like:

m = masks[index]
if m.ndim > 2:
    m = m[..., 0]

You may have to use np.squeeze too if the extra dimension annoys you.
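
Whether np.squeeze is needed depends on how the layer is selected: an integer index drops the axis, while a slice keeps a length-1 axis. A minimal sketch with a synthetic array standing in for a loaded mask; `to_2d` is a hypothetical helper name.

```python
import numpy as np


def to_2d(mask: np.ndarray) -> np.ndarray:
    """Reduce a mask to shape (H, W)."""
    if mask.ndim == 3 and mask.shape[-1] == 1:
        # (H, W, 1): remove only the trailing length-1 axis.
        return np.squeeze(mask, axis=-1)
    if mask.ndim == 3:
        # (H, W, C): keep the first layer.
        return mask[..., 0]
    return mask


m = np.zeros((360, 360, 4), dtype=np.uint8)
print(m[..., 0].shape)    # (360, 360): integer index drops the axis
print(m[..., :1].shape)   # (360, 360, 1): slice keeps a length-1 axis
print(to_2d(m[..., :1]).shape)  # (360, 360)
```

Passing `axis=-1` to np.squeeze avoids accidentally dropping a height or width of 1.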

ebouteillon avatar Mar 19 '18 14:03 ebouteillon

Thanks for your solution, but the error comes from this line:

 masks = skimage.io.imread_collection(mask_file).concatenate()

John1231983 avatar Mar 19 '18 14:03 John1231983

I'm using imread and not imread_collection from skimage. That's probably why I don't have this problem. Two solutions:

  • Alter the images directly (and do a pull request :) ).
  • Don't use concatenate(); perform the concatenation yourself after applying the trick above.
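
The second option could look roughly like this. It is a sketch, not the code used in this thread: `stack_masks` is a hypothetical name, and it assumes the masks have already been read into a list of NumPy arrays that share the same height and width.

```python
import numpy as np


def stack_masks(masks):
    """Stack masks into one (N, H, W) array, using layer 0 of any (H, W, C) mask."""
    planes = [m[..., 0] if m.ndim > 2 else m for m in masks]
    # All planes are now (H, W), so they can be stacked along a new first axis.
    return np.stack(planes, axis=0)
```

The input list might come from something like `[skimage.io.imread(p) for p in sorted(glob.glob(pattern))]`, where `pattern` is whatever glob matches one image's mask files. The result has the same (N, H, W) layout that the earlier code reads with `masks.shape[0]`.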

ebouteillon avatar Mar 19 '18 14:03 ebouteillon

Hi @John1231983, you can fix this with a small change to your code: define a custom reading function and pass it to the ImageCollection constructor. I fixed it this way:

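import skimage.io

def _imread_mask(f):
    # Read one mask file; keep only the first layer if it has several.
    mask = skimage.io.imread(f)
    if mask.ndim > 2:
        mask = mask[:, :, 0]
    return mask

...

# Every mask is loaded as 2-D, so concatenate() sees matching shapes.
masks = skimage.io \
    .ImageCollection(masks_path, load_func=_imread_mask) \
    .concatenate()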

yswe avatar Mar 20 '18 17:03 yswe