CONSEP - Getting dice_0=0 Classification during training

Open caprioGirl opened this issue 4 years ago • 6 comments

When training on CoNSeP, the Dice score for type 1 drops to 0 roughly every 20-23 epochs during stage 2 training. [Screenshot from 2021-11-23 15-59-53] The same behavior occurs with the PanNuke dataset. I used the settings described in the paper, i.e. LR=1e-4 for the first 25 epochs and 1e-5 for the last 25 epochs of both stages, but I still get these 0 values when training classification on CoNSeP and PanNuke.

caprioGirl avatar Nov 23 '21 11:11 caprioGirl

Hi, I am sorry for the late reply. The above can happen when the number of instances of type 1 for training is small. I think it should be the Miscellaneous type. You should try to split your training set (or double-check it) so that it contains more of that type.
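One way to double-check this is to count how many distinct nuclei of each type the training annotations contain. The sketch below is a minimal pure-Python illustration. It assumes each annotation is available as two same-sized 2D grids: an instance map (0 for background, a unique integer per nucleus) and a type map (0 for background, otherwise the type id). The function name and this input layout are assumptions made for illustration, not the repository's API.

```python
from collections import defaultdict


def count_instances_per_type(inst_map, type_map):
    """Count distinct nucleus instances for each type id.

    inst_map and type_map are hypothetical same-sized 2D lists:
    inst_map holds 0 for background and a unique id per nucleus,
    type_map holds 0 for background and the type id otherwise.
    """
    instances_by_type = defaultdict(set)
    for inst_row, type_row in zip(inst_map, type_map):
        for inst_id, type_id in zip(inst_row, type_row):
            # Skip background pixels in either map.
            if inst_id != 0 and type_id != 0:
                instances_by_type[type_id].add(inst_id)
    return {t: len(ids) for t, ids in sorted(instances_by_type.items())}


# Toy example: three nuclei, two of type 2 and one of type 1.
inst_map = [
    [1, 1, 0, 2],
    [0, 0, 0, 2],
    [3, 3, 0, 0],
]
type_map = [
    [2, 2, 0, 1],
    [0, 0, 0, 1],
    [2, 2, 0, 0],
]
counts = count_instances_per_type(inst_map, type_map)
print(counts)
```

Summing these counts over all training images would show whether one type has only a handful of instances compared with the others.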

vqdang avatar Nov 29 '21 11:11 vqdang

Yes, I figured as much: it is because of the Miscellaneous type. I am using the whole CoNSeP training set, just as in the paper, so I am confused about why it goes to 0. Am I doing something wrong? Did you split the training data with respect to the misc type to get your results? A low score I could understand, but why exactly 0, when the paper shows no such behavior? Is there a specific setting that I am missing? How did you handle this in the paper?

caprioGirl avatar Nov 29 '21 12:11 caprioGirl

Miscellaneous is a very noisy class. Sometimes you will get some unexpected behaviour. Do you have the dice over time during training? You should have this in your tensorboard output.
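To illustrate how a rare class can land on exactly 0 rather than just a low value: if the network predicts no pixels of that type at all on the validation set, the intersection is empty and the Dice score is 0 no matter how many ground-truth pixels exist. The sketch below is a generic pixel-level Dice computation under that assumption. It is not taken from the repository, and the function name and the epsilon value are illustrative.

```python
def class_dice(pred, true, type_id, eps=1e-8):
    """Pixel-level Dice for one type id over two same-sized 2D label grids.

    Generic formula: 2 * |pred AND true| / (|pred| + |true| + eps).
    The eps term only guards against division by zero.
    """
    intersection = pred_pixels = true_pixels = 0
    for pred_row, true_row in zip(pred, true):
        for p, t in zip(pred_row, true_row):
            is_pred = p == type_id
            is_true = t == type_id
            pred_pixels += is_pred
            true_pixels += is_true
            intersection += is_pred and is_true
    return 2.0 * intersection / (pred_pixels + true_pixels + eps)


# Ground truth contains two pixels of type 1.
true = [
    [1, 1, 0],
    [0, 2, 2],
]
# The prediction never outputs type 1, so the intersection is empty.
pred_without_type_1 = [
    [2, 2, 0],
    [0, 2, 2],
]
# The prediction recovers one of the two type 1 pixels.
pred_with_type_1 = [
    [1, 2, 0],
    [0, 2, 2],
]

dice_missing = class_dice(pred_without_type_1, true, type_id=1)
dice_partial = class_dice(pred_with_type_1, true, type_id=1)
print(dice_missing, dice_partial)
```

Under this reading, a curve that alternates between 0 and non-zero values would mean the type branch intermittently stops predicting that class altogether, which is more likely when the class is rare and noisy.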

simongraham avatar Dec 07 '21 16:12 simongraham

@simongraham Here is the misc graph from tensorboard. [Screenshot from 2021-12-08 14-00-47]

caprioGirl avatar Dec 08 '21 09:12 caprioGirl

Am I missing something in the classification pipeline? @simongraham More details on the training setup:

  1. Extract patches: 540x540_80x80, all of train, with test used as validation, original mode, act_shape = [270, 270], nr_type = 5.
  2. Stage 1: epochs = 50, msge weight = 2, all others are 1. LR = 1e-4, decayed by a factor of 0.1 every 25 epochs. The "pretrained" model is ImageNet-ResNet50-Preact_pytorch.tar. batch_size = 8.
  3. Stage 2: epochs = 50, msge weight = 2, all others are 1. LR = 1e-4, decayed by a factor of 0.1 every 25 epochs. batch_size = 4.
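The learning-rate schedule described in the steps above can be written out as a small sketch. It assumes zero-based epoch numbering and a plain step decay. The function name is hypothetical, and whether the repository's config implements the schedule exactly this way is not confirmed here.

```python
def step_lr(epoch, base_lr=1e-4, step_size=25, gamma=0.1):
    """Learning rate at a zero-based epoch under step decay.

    The rate is multiplied by gamma once every step_size epochs.
    """
    return base_lr * gamma ** (epoch // step_size)


# 50 epochs per stage: 1e-4 for epochs 0-24, then 1e-5 for epochs 25-49.
schedule = [step_lr(epoch) for epoch in range(50)]
print(schedule[0], schedule[24], schedule[25], schedule[49])
```

This has the same step_size and gamma form as PyTorch's `StepLR` scheduler, and it matches the "1e-4 for the first 25 epochs, 1e-5 for the last 25" description from the first post.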

caprioGirl avatar Jan 27 '22 10:01 caprioGirl

Hello! I have run into a similar problem. The validation output looks like this:

  ------valid-np_acc : 0.92795
  ------valid-np_dice : 0.73184
  ------valid-tp_dice_0 : 0.94518
  ------valid-tp_dice_1 : 0.02475
  ------valid-tp_dice_2 : 0.49475
  ------valid-tp_dice_3 : 0.00000
  ------valid-tp_dice_4 : 0.61169
  ------valid-hv_mse : 0.06671

The dice_3 value is always 0. I use the CoNSeP dataset and set nr_types to 5.
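For tracking this across epochs, the per-type values can be pulled out of the printed validation output and the zero entries flagged. The sketch below is a minimal example that assumes the output keeps the `valid-tp_dice_N : value` form shown in this comment. The regular expression and function name are illustrative and not part of the repository.

```python
import re

# Matches entries such as "valid-tp_dice_3 : 0.00000".
TYPE_DICE_RE = re.compile(r"valid-(tp_dice_\d+)\s*:\s*([0-9]*\.?[0-9]+)")


def zero_dice_types(log_text, tol=0.0):
    """Return the names of per-type Dice entries whose value is <= tol."""
    scores = {name: float(value) for name, value in TYPE_DICE_RE.findall(log_text)}
    return [name for name, score in scores.items() if score <= tol]


# Values copied from the validation output quoted in this comment.
log_text = (
    "------valid-np_acc : 0.92795 ------valid-np_dice : 0.73184 "
    "------valid-tp_dice_0 : 0.94518 ------valid-tp_dice_1 : 0.02475 "
    "------valid-tp_dice_2 : 0.49475 ------valid-tp_dice_3 : 0.00000 "
    "------valid-tp_dice_4 : 0.61169 ------valid-hv_mse : 0.06671"
)
flagged = zero_dice_types(log_text)
print(flagged)
```

Raising `tol` slightly, for example to 0.05, would also flag near-zero types such as tp_dice_1 in this output.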

hjtalent2023 avatar May 15 '23 11:05 hjtalent2023