
Very poor results on crowded objects, same as YOLOv4

Open ggenny opened this issue 1 year ago • 10 comments

I am evaluating performance on different images (people in crowds), and the detection ability on people is significantly lower than v3. As an example, this image (using the web demo):

[YOLOv3] ./darknet detector test cfg/coco.data ./cfg/yolov3.cfg ./yolov3.weights image

[YOLOv7, web demo, same result as YOLOv4] (image: yolov7)

Original image: (attachment 144037989-efaa1de8-e24d-488f-99c5-f9d8fd69bed5)

Is this a neck-definition problem?

ggenny avatar Aug 05 '22 02:08 ggenny

It is mainly because we use IoU as the target of objectness; you could reduce self.gr to avoid this issue.
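
A minimal sketch of the mechanism, assuming yolov7 keeps the yolov5-style objectness target target = (1 - gr) + gr * IoU (names and values below are illustrative):

import torch

# Sketch: self.gr blends a constant target of 1.0 with the box IoU.
def objectness_target(iou: torch.Tensor, gr: float) -> torch.Tensor:
    # gr = 1.0 -> target equals the IoU, so occluded boxes with low IoU are
    # trained toward low confidence; gr = 0.0 -> target is always 1.0.
    return (1.0 - gr) + gr * iou.clamp(0)

iou = torch.tensor([0.9, 0.5, 0.3])    # example IoUs of matched boxes
print(objectness_target(iou, gr=1.0))  # tensor([0.9000, 0.5000, 0.3000])
print(objectness_target(iou, gr=0.5))  # tensor([0.9500, 0.7500, 0.6500])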

By the way, my testing result of YOLOv7 is below: (image)

WongKinYiu avatar Aug 05 '22 03:08 WongKinYiu

Can you draw the boxes thinner so we can have a quick comparison between YOLOv3 and v7, @WongKinYiu? As far as I can see, yolov7 is still worse than v3 for this case.

trungpham2606 avatar Aug 05 '22 03:08 trungpham2606

Yes, to handle this case the model needs to be retrained following the suggestion above.

WongKinYiu avatar Aug 05 '22 04:08 WongKinYiu

@WongKinYiu, do we need to change these parameters in

hyp.scratch.yml

obj: 0.7  # obj loss gain (scale with pixels)
obj_pw: 1.0  # obj BCELoss positive_weight
iou_t: 0.20  # IoU training threshold

akashAD98 avatar Aug 05 '22 04:08 akashAD98

No, self.gr is set in train.py.
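
For reference, a one-line sketch of what that setting looks like (copied from the yolov5-style train.py that yolov7 derives from; the exact comment text is an assumption):

model.gr = 1.0  # iou loss ratio (obj_loss = 1.0 or iou); reduce below 1.0 to soften the IoU-based target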

WongKinYiu avatar Aug 05 '22 05:08 WongKinYiu

Okay, I have another question; sorry, it's a noob question.

hyp['cls'] *= nc / 80. * 3. / nl # scale to classes and layers

nc = number of classes
nl = model.model[-1].nl  # number of detection layers (used for scaling hyp['obj'])

What is the 80 for? (Is it COCO's 80 classes?) Do we need to change this number for a custom-trained model with a different number of classes?

akashAD98 avatar Aug 05 '22 05:08 akashAD98

Yes, COCO's 80 classes. You do not need to change it for a custom dataset.
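
A quick worked example of that scaling line (a sketch; nc = 6, the default nl = 3, and the starting gain are assumed values):

# Sketch of the automatic rescaling in train.py for a custom dataset.
nc, nl = 6, 3                 # 6 classes, 3 detection layers
cls = 0.3                     # hypothetical hyp['cls'] value from the yaml
cls *= nc / 80. * 3. / nl     # 0.3 * 6/80 * 3/3 = 0.0225
print(cls)                    # the COCO-specific 80 is rescaled away automatically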

WongKinYiu avatar Aug 05 '22 05:08 WongKinYiu

Try training your model on a crowd dataset.

spacewalk01 avatar Aug 05 '22 06:08 spacewalk01

Yes, COCO's 80 classes. You do not need to change it for a custom dataset.

For a custom dataset, I use YOLOv7 to train the object detection model. Do I just need the following 3 steps?

  1. Create the data/custom.yaml file, e.g.:
# COCO 2017 dataset http://cocodataset.org - first 128 training images
# Train command: python train.py --data coco128.yaml
# Default dataset location is next to /yolov5:
#   /parent_folder
#     /coco128
#     /yolov5


# download command/URL (optional)
#download: https://github.com/ultralytics/yolov5/releases/download/v1.0/coco128.zip

# train and val data as 1) directory: path/images/, 2) file: path/images.txt, or 3) list: [path1/images/, path2/images/]
train: /home/epbox/AI/dataset/pf/pddModelByFlash/v0.0.3/images/train/ 
val: /home/epbox/AI/dataset/pf/pddModelByFlash/v0.0.3/images/val/
test: /home/epbox/AI/dataset/pf/pddModelByFlash/v0.0.3/images/val/

# number of classes
nc: 6

# class names
names: ['scratch','crack','leakage','membrane','wiredscreen','blurredscreen']
  2. Change nc: 80 # number of classes to nc: 6 # number of classes in cfg/training/yolov7.yaml

  3. Choose a training mode (a small sanity-check sketch follows the commands below)

  • From scratch with p5:
python train.py --workers 8 --device 0 --batch-size 32 --epochs 100 --data data/custom.yaml --img 640 640 --cfg cfg/training/yolov7.yaml --weights '' --name yolov7_flash_p5 --hyp data/hyp.scratch.p5.yaml

  • Fine-tuning from the pretrained yolov7_training.pt weights:
python train.py --workers 8 --device 0 --batch-size 32 --epochs 100 --data data/custom.yaml --img 640 640 --cfg cfg/training/yolov7.yaml --weights '/home/epbox/AI/pre_weights/yolov7/yolov7_training.pt' --name yolov7_flash --hyp data/hyp.scratch.custom.yaml
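
As noted in step 3, a small sanity check can confirm the dataset yaml is consistent before training; this is only a sketch, and the paths are assumptions:

import yaml

# Sketch: nc in data/custom.yaml must equal len(names), and the same nc
# must also be set in cfg/training/yolov7.yaml (step 2 above).
with open('data/custom.yaml') as f:
    data = yaml.safe_load(f)
assert data['nc'] == len(data['names']), 'nc does not match number of names'
print('classes:', data['nc'], data['names'])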

Digital2Slave avatar Aug 05 '22 07:08 Digital2Slave

Just use a lower --conf 0.10 or --conf 0.025 and YOLOv7 will still have more correct predictions and fewer wrong predictions than YOLOv2/3/4/5/...

The main issue: you are only looking at True Positives in one image, while not taking into account millions of other images where there are no people but YOLOv2/3/4/5/... will predict people, so there will be more False Positives. This is why we measure accuracy on ~20,000 MS COCO test-dev images using the AP metric, which considers True/False Positives/Negatives across all possible confidence thresholds.

As @WongKinYiu said, YOLOv7 uses IoU(pred_box, target_box) as the target for objectness, so the loss function is:

  • for YOLOv4: obj_loss = nn.BCEWithLogitsLoss(predicted_objectness, 1.0)
  • for YOLOv7: obj_loss = nn.BCEWithLogitsLoss(predicted_objectness, IoU(pred, target)), so it reduces the predicted confidence score of imperfectly localized boxes
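
To see the effect numerically, here is a small sketch comparing the two targets on the same logit (the logit and IoU values are illustrative):

import torch
import torch.nn as nn

# Sketch: the same objectness logit trained toward 1.0 (YOLOv4-style)
# vs. toward the IoU (YOLOv7-style; here IoU = 0.4 for an occluded box).
bce = nn.BCEWithLogitsLoss()
logit = torch.tensor([2.0])              # sigmoid(2.0) ~= 0.88 confidence
print(bce(logit, torch.tensor([1.0])))   # ~0.13: 0.88 is already near 1.0
print(bce(logit, torch.tensor([0.4])))   # ~1.33: pushes confidence down toward 0.4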

Therefore there are 2 ways:

  1. Preferably re-train YOLOv7 with a reduced model.gr (it defaults to 1.0 at https://github.com/WongKinYiu/yolov7/blob/main/train.py#L288)
  2. Or just run detection with a 2x, 5x, or 10x lower confidence threshold

I just ran predictions using YOLOv2, v3, v4, v7, v7-e6e, v7 (thresh=0.025), and v7-e6e (thresh=0.025). You can see that v7 (thresh=0.025) and v7-e6e (thresh=0.025) predict almost all persons (and several backpacks, handbags, and cell phones) and have more correct predictions and fewer wrong predictions than the previous versions v2-v4.

For all models I use 640x640 resolution, while for v7-e6e I use 1280x1280.


YOLOv2: darknet.exe detector test data/coco.data cfg/yolov2.cfg yolov2.weights -thresh 0.25 -ext_output crowd.png

(image: yolov2)


YOLOv3: darknet.exe detector test data/coco.data cfg/yolov3.cfg yolov3.weights -thresh 0.25 -ext_output crowd.png

(image: yolov3)


YOLOv4: darknet.exe detector test data/coco.data cfg/yolov4.cfg yolov4.weights -thresh 0.25 -ext_output crowd.png

(image: yolov4)


YOLOv7 conf 0.25: python detect.py --weights yolov7.pt --conf 0.25 --img-size 640 --device 0 --source crowd.png --view-img

(image: yolov7)


YOLOv7-e6e conf 0.25: python detect.py --weights yolov7-e6e.pt --conf 0.25 --img-size 1280 --device 0 --source crowd.png --view-img

(image: yolov7_e6e)


YOLOv7 conf 0.025: python detect.py --weights yolov7.pt --conf 0.025 --img-size 640 --device 0 --source crowd.png --view-img

(image: yolov7_conf_0_025)


YOLOv7-e6e conf 0.025: python detect.py --weights yolov7-e6e.pt --conf 0.025 --img-size 1280 --device 0 --source crowd.png --view-img

(image: yolov7_e6e_conf_0_025)

AlexeyAB avatar Aug 05 '22 22:08 AlexeyAB

It is mainly because we use IoU as the target of objectness; you could reduce self.gr to avoid this issue.

By the way, my testing result of YOLOv7 is below: (image)

Is there a recommended value for self.gr? Thanks.

PearlDzzz avatar Nov 22 '22 11:11 PearlDzzz

Same question.

It is mainly because we use IoU as the target of objectness; you could reduce self.gr to avoid this issue. By the way, my testing result of YOLOv7 is below: (image)

Is there a recommended value for self.gr? Thanks.

Same question: what is the recommended value for model.gr? Setting it too high or too low to suit one image would produce poor results on other images.

sadimoodi avatar Nov 23 '22 07:11 sadimoodi

Hi @glenn-jocher!

I found an interesting issue and tested it with YOLOv5. Recent detection models that predict localization accuracy as part of the detection score perform poorly when many objects overlap.

(images: YOLOv5x, YOLOv5l, YOLOv5m, YOLOv5s, YOLOv5n predictions on crowd_people, each at conf 0.25 and at conf 0.05)

developer0hye avatar Dec 21 '22 00:12 developer0hye