
Code does not converge

Open lilong-epfl opened this issue 6 years ago • 5 comments

Dear Ling,

Thank you very much for making this code open source. However, I ran into some problems when running it.

[image] This happens when the fourth dimension is removed by the call image_out = tf.squeeze(con_out, 3) in the file mscnn.py.

[image] tmp1 will be None in the second iteration, since the while loop cannot assign it another value. This part of the code does not work.

[image] After making some corrections to your code, I could finally run the training code. But after hours of training, there is no sign that the training is converging.

Would it be possible to provide your pre-trained model so that the results in the paper can be reproduced?

Best wishes, Long LI

lilong-epfl May 15 '18 08:05

Thanks for your corrections; I have fixed some of the problems. However, this project still has some faults that I don't have time to fix.

Suggestion for the non-convergence problem: set a large learning rate, such as 1e-1.

Note: since I do not resize the dataset images to the same size, just set the batch_size param to 1.
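
To illustrate why batch_size has to be 1 here, a minimal sketch in NumPy (not the project's own data pipeline; the image sizes are made up): arrays of different spatial sizes cannot be stacked into one batch, while each image on its own forms a valid batch of size 1.

```python
import numpy as np

# Two hypothetical images of different sizes (height, width, channels).
img_a = np.zeros((240, 320, 3), dtype=np.float32)
img_b = np.zeros((480, 640, 3), dtype=np.float32)

# Stacking differently sized images into a single batch fails.
try:
    np.stack([img_a, img_b])
    stacked_ok = True
except ValueError:
    stacked_ok = False

# Each image alone can be given a leading batch axis of length 1.
batch_a = img_a[np.newaxis, ...]
batch_b = img_b[np.newaxis, ...]
```

Resizing or cropping all images to one common size would be the usual alternative if larger batches are wanted.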

Ling-Bao May 16 '18 04:05

Thank you very much for your reply and correction. Now the training runs perfectly. But the evaluation hits an issue: [image], which is caused by the following code: [image]. I believe this is a small issue, but it would be very nice if you could also correct it.

Thanks in advance! Best wishes, Long LI

lilong-epfl May 16 '18 12:05

The shape of the mscnn model output is [batch_size, w, h, c] with c = 1, so you can use tf.squeeze(predict, 3) to solve this.

predict = tf.squeeze(predict, 3)
l2_loss = tf.reduce_sum((predict - label) * (predict - label))
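
As a minimal sketch of the shape handling in the snippet above, with NumPy standing in for TensorFlow (np.squeeze plays the role of tf.squeeze here) and made-up array sizes: dropping the trailing channel axis makes the prediction match a label of shape [batch_size, w, h], and the summed squared error is then a scalar.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
batch_size, w, h = 1, 4, 6

predict = np.ones((batch_size, w, h, 1), dtype=np.float32)  # model output, c = 1
label = np.zeros((batch_size, w, h), dtype=np.float32)      # label without a channel axis

# Remove the channel axis (axis 3) so both arrays have shape [batch_size, w, h].
predict_squeezed = np.squeeze(predict, axis=3)

# Sum of squared differences, the same expression as in the snippet above.
l2_loss = np.sum((predict_squeezed - label) * (predict_squeezed - label))
```

Note that this only removes an axis of length 1; it does not change the spatial size of the prediction.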

Ling-Bao May 18 '18 15:05

When running this model with my own data I came across the same error:

Incompatible shapes: [1,272,480] vs. [1,1088,1920] [[{{node sub}} = Sub[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Squeeze, _arg_Placeholder_1_0_1/_111)]]

I have done this:

predict = tf.squeeze(predict, 3)
l2_loss = tf.reduce_sum((predict - label) * (predict - label))

But it still gives the same error.
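
The two shapes in the error differ by exactly a factor of 4 in each spatial dimension (1088 / 272 = 4 and 1920 / 480 = 4), so tf.squeeze alone cannot reconcile them. One possible reading, which is an assumption and not confirmed anywhere in this thread, is that the label density map is at input resolution while the network output is downsampled by 4. Under that assumption, a NumPy sketch of shrinking the label by summing each 4x4 block, which keeps the total count of the map unchanged. The helper name downsample_density is hypothetical.

```python
import numpy as np

def downsample_density(label, factor=4):
    """Sum-pool a [batch, rows, cols] density map by `factor` along both spatial axes.

    Hypothetical helper; assumes rows and cols are divisible by `factor`.
    """
    b, rows, cols = label.shape
    if rows % factor or cols % factor:
        raise ValueError("rows and cols must be divisible by factor")
    # Split each spatial axis into (blocks, factor) and sum over the factor axes.
    blocks = label.reshape(b, rows // factor, factor, cols // factor, factor)
    # Summing (rather than averaging) preserves the total count in the map.
    return blocks.sum(axis=(2, 4))

label = np.ones((1, 1088, 1920))  # label shape taken from the error message
small = downsample_density(label, factor=4)
```

With the shapes from the error message, small has the same shape as the squeezed prediction, so the subtraction in the loss would no longer fail on shape grounds. Whether this matches how the labels are meant to be prepared for this model would need to be checked against the project's data scripts.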

sratandeep16 Feb 19 '19 06:02

@sratandeep16 Hi, I ran into the same problem. Did you resolve it?

Blesszd Sep 16 '19 12:09