unet
adjustData function - purpose of reshaping for multi-class prediction
I'm trying to understand the adjustData function in data.py, particularly for multi-class prediction. The conversion to one-hot makes sense. However, I don't get the purpose of the following reshape code:
new_mask = np.reshape(new_mask,(new_mask.shape[0],new_mask.shape[1]*new_mask.shape[2],new_mask.shape[3])) if flag_multi_class else np.reshape(new_mask,(new_mask.shape[0]*new_mask.shape[1],new_mask.shape[2]))
Shouldn't the inputs to the model be retained as 2-D images? Why are the image dimensions (new_mask.shape[1], new_mask.shape[2]) being reshaped into a vector? Also, why is this applied only to mask and not to img?
This reshaping isn't done for binary classification, so I'm wondering what's different about the multi-class case.
Incidentally, shouldn't the model loss defined in model.py be categorical cross-entropy in the multi-class case? Perhaps the code isn't quite finished for multi-class usage. Has anybody tried running this for multi-class? Thanks in advance for any thoughts/help/clarification.
Hi, I am trying the same thing - changing the code to segment images with 5 classes. As far as I understand, you are right about the loss function; it should be 'categorical_crossentropy'. My problem is this error message: ValueError: Error when checking target: expected conv2d_24 to have 4 dimensions, but got array with shape (2, 65536, 5). It seems I got the masking wrong somehow. Did you manage to fix your problem with the semantic segmentation of multiple classes?
You can try the author's other repo, unet-multi. I asked a similar question there. I wonder if there is something about fit_generator that only takes 2-D mask inputs. I haven't tried to run a multi-class prediction myself.
Yeah, I don't think this code can work for the multi-class case. The shape of the ground truth is (batch_size, n_row*n_col, num_class), but the shape of the model output (conv_10) is (batch_size, n_row, n_col, 1). However, they should be consistent, since they are y and y-hat.
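A quick way to see the mismatch (a minimal sketch; the sizes are just the ones from the error message quoted above):

import numpy as np

# Batch of 2 images, 256x256, 5 classes, as in the reported error.
batch, rows, cols, num_class = 2, 256, 256, 5
y_true = np.zeros((batch, rows * cols, num_class))  # what adjustData produces: (2, 65536, 5)
y_pred = np.zeros((batch, rows, cols, 1))           # what the original conv10 emits: (2, 256, 256, 1)
print(y_true.shape, y_pred.shape)                   # Keras cannot compare these two shapes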
Same question here. Are there other, more polished U-Net Keras implementations available? Many thanks for any answers and help.
Has anybody had luck with this? I have a possible solution, but I'm not sure it'll work.
I have also been trying to work this out, as I have 20 classes! The reshaping makes no sense to me, but even if I comment it out, I still get errors. It seems like something needs to change about the way the U-Net model is set up: the last layer of the model has only 1 channel, but it should have a shape that matches our mask, which in my case is (768, 768, 20) after adjust_data (num_class = 20; 768 is the image size).
I changed the ending of the unet function to the following. I changed conv9 and conv10 to 20 channels to get the right number of outputs in the end, and I changed sigmoid to softmax; I recall softmax works better for multi-class (not sure). I also switched to categorical_crossentropy, as was suggested earlier here. This way at least I don't get errors. I'm training now and will report back on how it works.
conv9 = Conv2D(20, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv9)
conv10 = Conv2D(20, 1, activation = 'softmax')(conv9)
model = Model(input = inputs, output = conv10)
model.compile(optimizer = Adam(lr = 1e-4), loss = 'categorical_crossentropy', metrics = ['accuracy'])
Reporting back. It trained without problems and seemingly reached a low loss, but the outputs on test images were complete nonsense: every pixel got label 1. It seems like it isn't training on the data correctly...
Finally got it to work. Here are the changes! The issue with more than one class is that you need to score the output with categorical cross-entropy, but in Keras categorical cross-entropy can only compare vectors that have been one-hot encoded. That's why, at the end of unet, I reshape the final layer into that vector shape; similarly, in adjust_data, you need to convert the label to the one-hot vector shape so that categorical cross-entropy can do its job.
FYI, U-Net's performance does decrease significantly as you add labels. I saw noticeable drops with num_class > 5, but it all depends on the problem; my objects are pretty similar, which might be causing the issue.
This is the modified Unet function:
from keras.models import Model
from keras.layers import Input, Conv2D, MaxPooling2D, Dropout, UpSampling2D, concatenate, Reshape
from keras.optimizers import Adam

def unet(pretrained_weights = None, input_size = (256,256,1), num_class = 2):
    inputs = Input(input_size)
    conv1 = Conv2D(64, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(inputs)
    conv1 = Conv2D(64, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv1)
    pool1 = MaxPooling2D(pool_size=(2, 2))(conv1)
    conv2 = Conv2D(128, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(pool1)
    conv2 = Conv2D(128, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv2)
    pool2 = MaxPooling2D(pool_size=(2, 2))(conv2)
    conv3 = Conv2D(256, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(pool2)
    conv3 = Conv2D(256, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv3)
    pool3 = MaxPooling2D(pool_size=(2, 2))(conv3)
    conv4 = Conv2D(512, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(pool3)
    conv4 = Conv2D(512, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv4)
    drop4 = Dropout(0.5)(conv4)
    pool4 = MaxPooling2D(pool_size=(2, 2))(drop4)
    conv5 = Conv2D(1024, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(pool4)
    conv5 = Conv2D(1024, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv5)
    drop5 = Dropout(0.5)(conv5)
    up6 = Conv2D(512, 2, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(UpSampling2D(size = (2,2))(drop5))
    merge6 = concatenate([drop4,up6], axis = 3)
    conv6 = Conv2D(512, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(merge6)
    conv6 = Conv2D(512, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv6)
    up7 = Conv2D(256, 2, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(UpSampling2D(size = (2,2))(conv6))
    merge7 = concatenate([conv3,up7], axis = 3)
    conv7 = Conv2D(256, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(merge7)
    conv7 = Conv2D(256, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv7)
    up8 = Conv2D(128, 2, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(UpSampling2D(size = (2,2))(conv7))
    merge8 = concatenate([conv2,up8], axis = 3)
    conv8 = Conv2D(128, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(merge8)
    conv8 = Conv2D(128, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv8)
    up9 = Conv2D(64, 2, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(UpSampling2D(size = (2,2))(conv8))
    merge9 = concatenate([conv1,up9], axis = 3)
    conv9 = Conv2D(64, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(merge9)
    conv9 = Conv2D(64, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv9)
    conv9 = Conv2D(num_class, 3, activation = 'relu', padding = 'same', kernel_initializer = 'he_normal')(conv9)  # one channel per class
    conv10 = Conv2D(num_class, 1, activation = 'softmax')(conv9)  # softmax over the class axis instead of sigmoid
    conv11 = Reshape([input_size[0]*input_size[1], num_class])(conv10)  # flatten spatial dims to match the one-hot mask
    model = Model(inputs = inputs, outputs = conv11)
    model.compile(optimizer = Adam(lr = 2e-5), loss = 'categorical_crossentropy', metrics = ['accuracy'])
    #model.compile(optimizer = Adam(lr = 1e-4), loss = 'binary_crossentropy', metrics = ['accuracy'])
    model.summary()
    if(pretrained_weights):
        model.load_weights(pretrained_weights)
    return model
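For reference, a minimal usage sketch (my assumption: 256x256 grayscale inputs and 5 classes; the numbers are illustrative):

model = unet(input_size=(256, 256, 1), num_class=5)
print(model.output_shape)  # (None, 65536, 5), matching the reshaped one-hot masks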
In data.py, add
from keras.utils import to_categorical
and change part of the adjust_data function to the following
if(flag_multi_class):
    img = img / 255
    mask = mask[:,:,:,0] if(len(mask.shape) == 4) else mask[:,:,0]
    new_mask = np.reshape(mask, [mask.shape[0], mask.shape[1]*mask.shape[2]])
    new_mask = to_categorical(new_mask, num_classes=num_class, dtype='uint8')
    mask = new_mask
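For concreteness, here is a tiny shape walkthrough of that conversion (a sketch with illustrative sizes: a batch of 2 masks, 256x256, 5 classes):

import numpy as np
from keras.utils import to_categorical

num_class = 5
mask = np.random.randint(0, num_class, size=(2, 256, 256, 1))  # (batch, rows, cols, 1) as yielded by the generator
mask = mask[:, :, :, 0]                                        # drop the channel axis -> (2, 256, 256)
new_mask = np.reshape(mask, [mask.shape[0], mask.shape[1] * mask.shape[2]])  # flatten pixels -> (2, 65536)
new_mask = to_categorical(new_mask, num_classes=num_class, dtype='uint8')    # one-hot -> (2, 65536, 5)
print(new_mask.shape)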
@mazatov Thank you so much for your idea.
The problem for me now is how to predict on RGB images with the model I have trained. I used the prediction code provided by the author, but the output image looks like a straight line. Can anyone give me some ideas for dealing with this?
@mazatov
Hello, I used the code above to process my mask image, but I got the following error:
IndexError Traceback (most recent call last)
<ipython-input-38-d7a7b83ff0f6> in <module>
2 mask = img_to_array(mask)
3 mask = mask[:,:,:,0] if(len(mask.shape) == 4) else mask[:,:,0]
----> 4 new_mask=np.reshape(mask,[mask.shape[0],mask.shape[1]*mask.shape[2]])
5 new_mask = to_categorical(new_mask, num_classes=2, dtype='uint8')
6 mask = new_mask
IndexError: tuple index out of range
My mask image has 3 channels. What's wrong?
I also have some ultrasound data with a total of 2 categories, benign and malignant, each with corresponding mask labels. I want to do multi-class detection with U-Net, but I don't know how to modify the model or how to process my data into multiple categories, and I don't understand how my labels should be defined. Can you teach me more? Thank you!
Looks like your mask is missing the batch dimension: the mask[:,:,0] slice leaves it 2-D, so mask.shape[2] raises the IndexError. In the code above, mask is a 4-D array (batch, rows, cols, channels) coming straight out of the generator.
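If so, a minimal sketch of a possible fix (my assumption: the mask was loaded as a single (rows, cols, channels) image, e.g. via img_to_array, so the batch axis is missing) is to add that axis before the slicing and reshape:

import numpy as np

mask = np.zeros((256, 256, 3))       # stand-in for a single 3-channel mask image
mask = np.expand_dims(mask, axis=0)  # (1, 256, 256, 3): add the missing batch axis
mask = mask[:, :, :, 0]              # (1, 256, 256), as in the snippet above
new_mask = np.reshape(mask, [mask.shape[0], mask.shape[1] * mask.shape[2]])  # (1, 65536), no IndexError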
I have the same problem (the same IndexError when processing my masks).
Can somebody please explain what the three dimensions of mask are in the adjust_data snippet above? I just have a 2-D mask with the class index as each pixel value. Is mask.shape[0] the batch size? I believe we are running it for only one image and mask at a time. Please correct me if I am wrong. Thanks a lot.
Is the reshape to (1, row*col, num_classes) at the end of the modified unet really necessary? Could I feed the model targets of shape (1, row, col, num_classes) instead?