yolov7

Error While Training with Hyperparameter Evolution

Open Galafala opened this issue 2 years ago • 13 comments

Hello,

I was trying to train a model with hyperparameter evolution, but an error occurred. Did I do anything wrong?

Below are the error I got and the command I used in the terminal.

Screen Shot 2022-07-28 at 20 39 11

Screen Shot 2022-07-28 at 20 39 33

Galafala avatar Jul 28 '22 12:07 Galafala

I just sent a pull request to fix this error. The problem is that the 'anchors' parameter is missing from your hyp.yaml file. You can add it manually with the line anchors: 3 (3 is a good default value). You'll also run into another error, because train.py is missing the 'paste_in' and 'copy_paste' entries in the metadata variable called 'meta' at line 614.
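As a minimal sketch of the manual fix, the snippet below fills in a missing 'anchors' entry on a plain dictionary that stands in for the loaded hyp.yaml. The helper name fill_missing_hyp and the shortened hyp dictionary are hypothetical and only for illustration; the default of 3 is the value suggested above.

```python
def fill_missing_hyp(hyp, defaults):
    """Return a patched copy of hyp plus the list of keys that were added."""
    patched = dict(hyp)  # copy so the original dictionary is left untouched
    added = []
    for key, value in defaults.items():
        if key not in patched:
            patched[key] = value
            added.append(key)
    return patched, added


# Hypothetical, heavily shortened stand-in for a loaded hyp.yaml.
hyp = {"lr0": 0.01, "momentum": 0.937, "mixup": 0.15}

# Only 'anchors' is filled in here, using the default suggested in this thread.
defaults = {"anchors": 3}

patched, added = fill_missing_hyp(hyp, defaults)
print("added keys:", added)
print("patched hyp:", patched)
```

Editing hyp.yaml directly has the same effect; the sketch only shows what the file needs to contain before evolution starts.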

Dhiaeddine-Oussayed avatar Jul 28 '22 12:07 Dhiaeddine-Oussayed

@Dhiaeddine-Oussayed thanks for your help. I added anchors to hyp.yaml and that problem is solved. However, I hit a new error, shown below. How can I solve it?

Screen Shot 2022-07-28 at 21 01 15

Galafala avatar Jul 28 '22 13:07 Galafala

@Galafala did you add 'paste_in' and 'copy_paste' to train.py?

Dhiaeddine-Oussayed avatar Jul 28 '22 13:07 Dhiaeddine-Oussayed

@Galafala if you didn't, you can just replace the existing definition in your train.py at line 614 with this:

meta = {'lr0': (1, 1e-5, 1e-1),  # initial learning rate (SGD=1E-2, Adam=1E-3)
                'lrf': (1, 0.01, 1.0),  # final OneCycleLR learning rate (lr0 * lrf)
                'momentum': (0.3, 0.6, 0.98),  # SGD momentum/Adam beta1
                'weight_decay': (1, 0.0, 0.001),  # optimizer weight decay
                'warmup_epochs': (1, 0.0, 5.0),  # warmup epochs (fractions ok)
                'warmup_momentum': (1, 0.0, 0.95),  # warmup initial momentum
                'warmup_bias_lr': (1, 0.0, 0.2),  # warmup initial bias lr
                'box': (1, 0.02, 0.2),  # box loss gain
                'cls': (1, 0.2, 4.0),  # cls loss gain
                'cls_pw': (1, 0.5, 2.0),  # cls BCELoss positive_weight
                'obj': (1, 0.2, 4.0),  # obj loss gain (scale with pixels)
                'obj_pw': (1, 0.5, 2.0),  # obj BCELoss positive_weight
                'iou_t': (0, 0.1, 0.7),  # IoU training threshold
                'anchor_t': (1, 2.0, 8.0),  # anchor-multiple threshold
                'anchors': (2, 2.0, 10.0),  # anchors per output grid (0 to ignore)
                'fl_gamma': (0, 0.0, 2.0),  # focal loss gamma (efficientDet default gamma=1.5)
                'hsv_h': (1, 0.0, 0.1),  # image HSV-Hue augmentation (fraction)
                'hsv_s': (1, 0.0, 0.9),  # image HSV-Saturation augmentation (fraction)
                'hsv_v': (1, 0.0, 0.9),  # image HSV-Value augmentation (fraction)
                'degrees': (1, 0.0, 45.0),  # image rotation (+/- deg)
                'translate': (1, 0.0, 0.9),  # image translation (+/- fraction)
                'scale': (1, 0.0, 0.9),  # image scale (+/- gain)
                'shear': (1, 0.0, 10.0),  # image shear (+/- deg)
                'perspective': (0, 0.0, 0.001),  # image perspective (+/- fraction), range 0-0.001
                'flipud': (1, 0.0, 1.0),  # image flip up-down (probability)
                'fliplr': (0, 0.0, 1.0),  # image flip left-right (probability)
                'mosaic': (1, 0.0, 1.0),  # image mosaic (probability)
                'mixup': (1, 0.0, 1.0),   # image mixup (probability)
                'copy_paste': (1, 0.0, 1.0),  # segment copy-paste (probability)
                'paste_in': (1, 0.0, 1.0)}    # segment copy-paste (probability)
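
For readers unfamiliar with this dictionary: each entry appears to be a tuple of (mutation scale, lower limit, upper limit). The sketch below is a simplified illustration of how such a tuple could be used to mutate one value and clamp it to its limits. It is not the actual logic in train.py; the function name mutate_value, the sigma parameter, and the 0.3 to 3.0 factor range are assumptions made for this example.

```python
import random


def mutate_value(value, spec, rng, sigma=0.2):
    """Mutate one hyperparameter and clamp it to the limits in spec.

    spec is (mutation_scale, lower_limit, upper_limit), as in 'meta'.
    Simplified illustration only, not the evolution code in train.py.
    """
    scale, low, high = spec
    # A scale of 0 yields a factor of exactly 1.0, so the value is unchanged.
    factor = 1.0 + scale * rng.gauss(0.0, sigma)
    factor = min(max(factor, 0.3), 3.0)  # assumed bound on the change per step
    return min(max(value * factor, low), high)  # constrain to the limits


rng = random.Random(0)  # seeded so repeated runs give the same numbers
new_lr0 = mutate_value(0.01, (1, 1e-5, 1e-1), rng)   # 'lr0' entry
new_iou_t = mutate_value(0.2, (0, 0.1, 0.7), rng)    # 'iou_t' entry, scale 0
print("lr0:", new_lr0)
print("iou_t:", new_iou_t)
```

The point of the sketch is that every hyperparameter being evolved needs its own tuple, which is why entries missing from 'meta' cause an error.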

Dhiaeddine-Oussayed avatar Jul 28 '22 13:07 Dhiaeddine-Oussayed

@Dhiaeddine-Oussayed thanks for your response. The problem still exists. I noticed that it appeared after anchors: 3 was added to hyp.yaml, and it happens both with and without evolution.

Galafala avatar Jul 28 '22 13:07 Galafala

Yes, I did. Below is the hyp.yaml I used.

Screen Shot 2022-07-28 at 21 58 18

Galafala avatar Jul 28 '22 13:07 Galafala

Can you share your train.py file with me, please?

Dhiaeddine-Oussayed avatar Jul 28 '22 14:07 Dhiaeddine-Oussayed

@Dhiaeddine-Oussayed This is the zip file of my train_aux.py. Should I paste the code, or is the zip file fine?

train_aux.py.zip

Galafala avatar Jul 28 '22 14:07 Galafala

@Galafala My pull request has been merged. Try pulling master and testing again.

Dhiaeddine-Oussayed avatar Jul 28 '22 19:07 Dhiaeddine-Oussayed

@Dhiaeddine-Oussayed, I have pulled the latest files and trained a model with evolution. train.py is working, but train_aux.py still raises errors. The errors are below.

Screen Shot 2022-07-29 at 03 26 00

Galafala avatar Jul 28 '22 19:07 Galafala

Did you change "nc" (the number of classes) in your config file?

Dhiaeddine-Oussayed avatar Jul 29 '22 09:07 Dhiaeddine-Oussayed

@Dhiaeddine-Oussayed Yes, I did.

Screen Shot 2022-07-29 at 22 25 59

Galafala avatar Jul 29 '22 14:07 Galafala

The number of parameters in the meta dictionary in train.py should equal the number of parameters in hyp.scratch.p5.yaml.
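
A rough way to check this before starting a run is to compare the two sets of keys. The sketch below uses plain dictionaries as shortened, hypothetical stand-ins for the loaded hyp.scratch.p5.yaml and for meta; the helper name compare_hyp_and_meta is also made up for this example.

```python
def compare_hyp_and_meta(hyp, meta):
    """Return (keys only in hyp, keys only in meta), preserving order."""
    hyp_only = [k for k in hyp if k not in meta]
    meta_only = [k for k in meta if k not in hyp]
    return hyp_only, meta_only


# Hypothetical, shortened stand-ins for the real file and dictionary.
hyp = {"lr0": 0.01, "mixup": 0.15, "copy_paste": 0.0, "paste_in": 0.15}
meta = {
    "lr0": (1, 1e-5, 1e-1),
    "anchors": (2, 2.0, 10.0),
    "mixup": (1, 0.0, 1.0),
}

hyp_only, meta_only = compare_hyp_and_meta(hyp, meta)
print("in hyp but not in meta:", hyp_only)
print("in meta but not in hyp:", meta_only)
print("lengths match:", len(hyp) == len(meta))
```

Both lists should be empty when the two are in sync. Equal lengths alone are not enough, since a key could be missing on each side.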

qwtoe avatar Aug 03 '22 11:08 qwtoe

Hi, I ran into exactly the same problem. Have you solved it? Thanks!

gavin-usyd avatar Aug 12 '22 09:08 gavin-usyd