
Low mAP, can't train a good result on my datasets

Open angryhen opened this issue 4 years ago • 16 comments

I trained on my dataset with 2 classes. First I trained the head only with lr=1e-3, then I trained the whole model with lr=1e-5. The validation class_loss is about 0.2 and reg_loss is 0.03, but my mAP results are as follows:

 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.019
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.040
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.016
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.011
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.044
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.024
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.076
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.131
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.089
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.156

Please tell me what to do.

angryhen avatar Apr 24 '20 18:04 angryhen
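For context, the AP rows above differ only in the IoU threshold used to match detections to ground truth (0.50:0.95 averages over ten thresholds from 0.50 to 0.95), and the -1.000 entries are the pycocotools sentinel meaning no ground-truth boxes fall in that area range (no "small" objects here). A minimal, hypothetical IoU helper for axis-aligned (x1, y1, x2, y2) boxes, for illustration only:

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0
```

A detection counts as a true positive for the AP@0.75 row only if its IoU with a ground-truth box is at least 0.75, which is why that row is usually the lowest.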

Update your code; the author has changed loss.py.

kiddliao avatar Apr 24 '20 19:04 kiddliao

Update your code; the author has changed loss.py.

I git cloned this project on 4/23; it's the latest.

angryhen avatar Apr 24 '20 19:04 angryhen

Can you share the loss graph? I think 1e-3 is too big for a simple dataset like that. It is much easier to see whether there is overfitting on the loss graph.

zylo117 avatar Apr 25 '20 05:04 zylo117
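One way to check for overfitting without eyeballing the graph is to compare recent trends of the training and validation losses. This heuristic is a sketch of my own (not from the repo), with made-up window and tolerance defaults:

```python
def trend(xs, window):
    """How much the mean loss dropped between the previous and last `window` epochs."""
    recent = sum(xs[-window:]) / window
    earlier = sum(xs[-2 * window:-window]) / window
    return earlier - recent  # positive means the loss is still falling

def looks_overfit(train_loss, val_loss, window=5, tol=1e-3):
    """True if training loss is still dropping while validation loss has plateaued."""
    if len(train_loss) < 2 * window or len(val_loss) < 2 * window:
        return False  # not enough history to judge
    return trend(train_loss, window) > tol and trend(val_loss, window) <= tol
```

If this fires early in training, the model is already memorising the training set rather than learning generalisable features.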


Hi, I've tried it several times. Here are my actions and results:

1. python train.py -c 0 -p custom --lr 1e-3 --batch_size 32 --load_weights weights/efficientdet-d0.pth --num_epochs 200 --head_only True
Val. Epoch: 110/200. Classification loss: 0.59728. Regression loss: 0.05891. Total loss: 0.65619

 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.001
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.008
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.041
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.033
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.046

2. python train.py -c 0 -p custom --lr 1e-5 --batch_size 16 --load_weights logs/custom/efficientdet-d0_107_3000.pth --num_epochs 1000
Val. Epoch: 136/1000. Classification loss: 0.20790. Regression loss: 0.04673. Total loss: 0.25463

 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.001
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.005
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.001
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.001
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.004
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.016
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.041
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.037

[loss graph screenshots 1-3]

angryhen avatar Apr 25 '20 12:04 angryhen

it's overfitting. totally. from the very beginning

zylo117 avatar Apr 25 '20 14:04 zylo117

I've been training for only a short time, so I can't believe it's overfitting. Do you have any suggestions?

angryhen avatar Apr 25 '20 14:04 angryhen

reduce lr

zylo117 avatar Apr 25 '20 14:04 zylo117
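Concretely, "reduce lr" can mean both a smaller starting lr and a scheduler that decays it when the validation loss plateaus. A hedged sketch using PyTorch's ReduceLROnPlateau (the Linear model is just a stand-in; I believe the repo's train.py uses a similar scheduler, but check your copy):

```python
import torch

# Start from a smaller lr than 1e-3 and cut it 10x whenever the validation
# loss fails to improve for `patience` consecutive epochs.
model = torch.nn.Linear(10, 2)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="min", factor=0.1, patience=3)

# Simulate six epochs whose validation loss never improves.
for val_loss in [1.0, 1.0, 1.0, 1.0, 1.0, 1.0]:
    scheduler.step(val_loss)

print(optimizer.param_groups[0]["lr"])  # reduced below the initial 1e-4
```

Note the scheduler only slows the damage; as the author says below, it cannot undo weights that already overfit.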

Reduce lr

but the lr dropped to 1e-8 by the end of training

angryhen avatar Apr 25 '20 14:04 angryhen

But it had been overfitting from the very beginning. It's nearly impossible to save the weights from overfitting once it happens.

It's like flying to another country, realising you have gone far past your destination, and then trying to walk back to where you really want to be. Imagine that.

zylo117 avatar Apr 25 '20 15:04 zylo117

Reduce lr

but the LR dropped to 1e-8 at the end of training

have you fixed your problem?

YanqingWu avatar Apr 28 '20 10:04 YanqingWu

Reduce lr

but the LR dropped to 1e-8 at the end of training

have you fixed your problem?

I think it's hard to train on small objects, but I got a normal result on datasets with normal anchor sizes.

angryhen avatar Apr 28 '20 10:04 angryhen

Reduce lr

but the LR dropped to 1e-8 at the end of training

have you fixed your problem?

I think it's hard to train on small objects, but I got a normal result on datasets with normal anchor sizes.

In case you need to modify your anchors, please try this repo: https://github.com/Cli98/anchor_computation_tool

Cli98 avatar May 09 '20 23:05 Cli98

it's overfitting. totally. from the very beginning

Could you please tell me how to determine whether it's overfitting? Because it seems fine to me.

czzbb avatar May 18 '20 13:05 czzbb

@czzbb there is no certain standard for overfitting.

zylo117 avatar May 18 '20 15:05 zylo117

I want to know why this curve indicates overfitting.


@czzbb there is no certain standard for overfitting.


czzbb avatar May 18 '20 16:05 czzbb

@czzbb, your training loss is still decreasing while your validation loss has plateaued. That means your model is fitting its weights too closely to the training set and not generalising well, since it is no longer improving on unseen data. A good strategy is to stop training based on the validation loss.

heitorrapela avatar Oct 07 '21 21:10 heitorrapela
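The stop-on-validation-loss strategy above can be sketched as a small patience counter. This is an illustrative helper, not part of the repo:

```python
class EarlyStopping:
    """Stop training when validation loss hasn't improved for `patience` epochs."""

    def __init__(self, patience=5, min_delta=1e-4):
        self.patience = patience
        self.min_delta = min_delta  # minimum drop that counts as an improvement
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Call once per epoch; returns True when training should stop."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience
```

Checkpointing the weights whenever `best` improves means the saved model is the one from just before the validation loss stalled, rather than the overfit final epoch.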