The loss converges very slowly

Open seasonyang opened this issue 6 years ago • 9 comments

I am training on my own data, starting from the pre-trained checkpoint vgg16.chkp. My dataset is 1500 pictures with two classes: 0 (background) and 1 (myObject), and my batch size is 50. Why does the loss converge so slowly?!

```
INFO:tensorflow:Recording summary at step 266003.
INFO:tensorflow:global step 266010: loss = 16.8371 (1.140 sec/step)
INFO:tensorflow:global step 266020: loss = 16.5885 (1.081 sec/step)
INFO:tensorflow:global step 266030: loss = 18.6807 (1.163 sec/step)
INFO:tensorflow:global step 266040: loss = 18.6046 (1.085 sec/step)
INFO:tensorflow:global step 266050: loss = 18.5053 (1.079 sec/step)
```

seasonyang avatar Aug 24 '17 02:08 seasonyang
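
For context on the numbers above: the SSD training loss is a classification term over matched (positive) anchors plus hard-mined negative anchors, plus a smooth-L1 localization term over positives, normalized by the number of positives. Its absolute value therefore depends heavily on how many of the several thousand anchors actually match your objects, which makes it hard to compare across datasets. Below is a minimal NumPy sketch of that structure, as an illustration of the multibox loss described in the SSD paper rather than this repository's `ssd_losses` code; the array shapes, `neg_ratio`, and `alpha` weight are assumptions.

```python
import numpy as np

def smooth_l1(x):
    """Elementwise smooth-L1 (Huber) loss."""
    ax = np.abs(x)
    return np.where(ax < 1.0, 0.5 * x ** 2, ax - 0.5)

def multibox_loss(cls_logits, loc_pred, gt_labels, gt_offsets,
                  neg_ratio=3.0, alpha=1.0):
    """Illustrative SSD loss: cross-entropy over positives plus hard-mined
    negatives, and smooth-L1 localization over positives only.

    cls_logits : (num_anchors, num_classes) raw class scores
    loc_pred   : (num_anchors, 4) predicted box offsets
    gt_labels  : (num_anchors,) matched label per anchor, 0 = background
    gt_offsets : (num_anchors, 4) regression targets for matched anchors
    """
    # numerically stable softmax cross-entropy per anchor
    logits = cls_logits - cls_logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    xent = -log_probs[np.arange(len(gt_labels)), gt_labels]

    pos = gt_labels > 0
    n_pos = max(int(pos.sum()), 1)

    # hard negative mining: keep only the highest-loss background anchors,
    # at most neg_ratio times the number of positives
    neg_xent = np.sort(xent[~pos])[::-1][: int(neg_ratio * n_pos)]

    cls_loss = (xent[pos].sum() + neg_xent.sum()) / n_pos
    loc_loss = smooth_l1(loc_pred[pos] - gt_offsets[pos]).sum() / n_pos
    return cls_loss + alpha * loc_loss
```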

Does anybody know how to make it converge more quickly?

oowe avatar Sep 04 '17 07:09 oowe

I would like to know too, please.

oowe avatar Sep 04 '17 15:09 oowe

Maybe your own data's features are not distinctive enough? When I train on data whose features are obvious, it converges quickly.

You can also try adjusting the batch size during training.

It may not be the correct answer, but it was effective for me.

seasonyang avatar Sep 05 '17 10:09 seasonyang
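
One caveat when changing the batch size: the learning rate usually has to move with it. A common heuristic (not something this repository enforces) is to scale the learning rate linearly with the batch size, relative to a configuration known to train stably. A minimal sketch, assuming a reference configuration of batch size 32 with learning rate 0.001:

```python
def scaled_learning_rate(batch_size, ref_batch_size=32, ref_lr=1e-3):
    """Linear-scaling heuristic: keep lr / batch_size roughly constant.

    ref_batch_size and ref_lr are assumed reference values; substitute
    whatever configuration you know trains stably.
    """
    return ref_lr * batch_size / ref_batch_size

# e.g. for the batch of 50 mentioned above
print(scaled_learning_rate(50))   # roughly 0.0016
```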

@seasonyang I am facing the same problem, and the training loss is very large. Do you know why, and can you tell me how to fix it?

congjianting avatar Sep 18 '17 03:09 congjianting

Have you solved the problem? Can you tell me how to change this model so it outputs just two classes?

CODEJY avatar Oct 19 '17 06:10 CODEJY
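
On the two-class question: the number of classes enters SSD through the class-prediction convolution at each feature layer, which outputs num_anchors * num_classes channels per location (class 0 is reserved for background); the same class count typically also has to be set wherever the dataset and network parameters define it. A small sketch of that relationship, using SSD-300-style feature-map sizes and anchors-per-location counts as assumptions rather than this repository's exact configuration:

```python
# Illustrative SSD-300-style settings (assumed, not this repo's exact values):
# feature-map side length and anchors per location for each prediction layer.
feature_maps = {"block4": (38, 4), "block7": (19, 6), "block8": (10, 6),
                "block9": (5, 6), "block10": (3, 4), "block11": (1, 4)}

def prediction_head_shapes(num_classes):
    """Output channels of the class and box heads for each feature layer."""
    shapes = {}
    for name, (size, num_anchors) in feature_maps.items():
        shapes[name] = {
            "cls_channels": num_anchors * num_classes,   # class scores
            "loc_channels": num_anchors * 4,             # box offsets
            "anchors": size * size * num_anchors,
        }
    return shapes

# background + 1 object class, as in the setup described above
for name, s in prediction_head_shapes(num_classes=2).items():
    print(name, s)
```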

I face the same problem:

INFO:tensorflow:global step 194010: loss = 3.4237 (0.767 sec/step)
INFO:tensorflow:global step 194020: loss = 2.6979 (0.746 sec/step)
INFO:tensorflow:global step 194030: loss = 2.4754 (0.759 sec/step)
INFO:tensorflow:global step 194040: loss = 5.0463 (0.757 sec/step)
INFO:tensorflow:Recording summary at step 194042.
INFO:tensorflow:global step 194050: loss = 5.1746 (0.729 sec/step)
INFO:tensorflow:global step 194060: loss = 1.6664 (0.775 sec/step)
INFO:tensorflow:global step 194070: loss = 3.5295 (0.766 sec/step)
INFO:tensorflow:global step 194080: loss = 2.6343 (0.764 sec/step)
INFO:tensorflow:global step 194090: loss = 3.4903 (0.766 sec/step)

How can I solve this?

RoseLii avatar Oct 23 '17 01:10 RoseLii
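
Per-step values like those above are noisy because each one comes from a single batch, so convergence is easier to judge from a smoothed curve (the TensorBoard loss summary, or a simple running average). A minimal sketch of a bias-corrected exponential moving average over the logged values, purely as an illustration:

```python
def smooth(losses, beta=0.98):
    """Bias-corrected exponential moving average, as commonly used for
    loss curves."""
    avg, out = 0.0, []
    for i, x in enumerate(losses, start=1):
        avg = beta * avg + (1 - beta) * x
        out.append(avg / (1 - beta ** i))   # bias-corrected estimate
    return out

# the per-step values logged above
print(smooth([3.4237, 2.6979, 2.4754, 5.0463, 5.1746,
              1.6664, 3.5295, 2.6343, 3.4903])[-1])
```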

I think you guys are getting good results. Many other people's losses stay around 40, yet they still get a high mAP in evaluation. So I don't think it matters if the loss is high, as long as the mAP in the evaluation step is good.

Janezzliu avatar Apr 03 '18 10:04 Janezzliu
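
For reference, mAP here is the mean over classes of the average precision computed from the ranked detections, which is what the repository's evaluation script reports. Below is a minimal sketch of the standard all-point (VOC-2010-style) AP computation from precision/recall arrays, independent of this repository's TensorFlow metrics code; the toy numbers at the end are made up.

```python
import numpy as np

def average_precision(recall, precision):
    """All-point interpolated AP: area under the precision-recall curve
    after making precision monotonically non-increasing from right to left."""
    r = np.concatenate(([0.0], recall, [1.0]))
    p = np.concatenate(([0.0], precision, [0.0]))
    # envelope: p[i] = max(p[i], p[i+1], ...)
    for i in range(len(p) - 2, -1, -1):
        p[i] = max(p[i], p[i + 1])
    # sum rectangle areas where recall changes
    idx = np.where(r[1:] != r[:-1])[0]
    return float(np.sum((r[idx + 1] - r[idx]) * p[idx + 1]))

# toy precision/recall values from a ranked list of detections
rec = np.array([0.25, 0.5, 0.5, 0.75])
prec = np.array([1.0, 1.0, 0.66, 0.75])
print(average_precision(rec, prec))
```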

@RoseLii My loss only comes down to around 25 after training for 100k steps. Your loss is very low; did you fix the code? The matching strategy is different from the original paper, but I do not know how to fix it.

davinca avatar May 02 '18 03:05 davinca
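
For reference, the matching strategy in the original SSD paper has two steps: each ground-truth box is first matched to its highest-IoU anchor, and then every remaining anchor whose IoU with some ground truth exceeds 0.5 is also marked positive. Below is a minimal NumPy sketch of that rule as described in the paper, not this repository's encoding code (which the comment above says behaves differently); the box format and threshold are assumptions.

```python
import numpy as np

def iou_matrix(anchors, gt_boxes):
    """Pairwise IoU between anchors (N,4) and ground-truth boxes (M,4),
    boxes given as [ymin, xmin, ymax, xmax]."""
    ymin = np.maximum(anchors[:, None, 0], gt_boxes[None, :, 0])
    xmin = np.maximum(anchors[:, None, 1], gt_boxes[None, :, 1])
    ymax = np.minimum(anchors[:, None, 2], gt_boxes[None, :, 2])
    xmax = np.minimum(anchors[:, None, 3], gt_boxes[None, :, 3])
    inter = np.clip(ymax - ymin, 0, None) * np.clip(xmax - xmin, 0, None)
    area_a = (anchors[:, 2] - anchors[:, 0]) * (anchors[:, 3] - anchors[:, 1])
    area_g = (gt_boxes[:, 2] - gt_boxes[:, 0]) * (gt_boxes[:, 3] - gt_boxes[:, 1])
    return inter / (area_a[:, None] + area_g[None, :] - inter)

def match_anchors(anchors, gt_boxes, gt_labels, threshold=0.5):
    """Paper-style matching: best anchor per ground truth, plus any anchor
    with IoU above the threshold. Returns a label per anchor (0 = background)."""
    iou = iou_matrix(anchors, gt_boxes)                 # (N, M)
    labels = np.zeros(len(anchors), dtype=np.int64)

    # 1) anchors above the IoU threshold take the label of their best GT
    best_gt = iou.argmax(axis=1)
    mask = iou.max(axis=1) > threshold
    labels[mask] = gt_labels[best_gt[mask]]

    # 2) every ground truth claims its single best anchor, regardless of IoU
    best_anchor = iou.argmax(axis=0)
    labels[best_anchor] = gt_labels
    return labels
```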

> I think you guys are getting good results. Many other people's losses stay around 40, yet they still get a high mAP in evaluation. So I don't think it matters if the loss is high, as long as the mAP in the evaluation step is good.

@Janezzliu That raises the question: how do we improve the mAP?

qianweilzh avatar Aug 24 '21 02:08 qianweilzh