Unable to do evaluation on the trained model.

Open Koteswara-ML opened this issue 4 years ago • 10 comments

Hey, I have trained the efficientdet-d4 model on a TPU. When I run the evaluation script on the generated checkpoints with the command below, it fails:

python main.py --mode=eval --training_file_pattern=tfrecord/*.tfrecord --validation_file_pattern=tfrecord/*.tfrecord --model_name=efficientdet-d4 --model_dir=efficientdet-d4-finetune --ckpt=efficientdet-d4 --train_batch_size=32 --eval_batch_size=32 --eval_samples=1024 --num_examples_per_epoch=5717 --num_epochs=50 --hparams=voc_config.yaml

It is throwing this error:

(0) Invalid argument: Incompatible shapes: [32,810,128,128] vs. [32,9,128,128]
  [[node focal_loss/logistic_loss/GreaterEqual (defined at /home/koteswara_rao/automl_versio1/efficientdet/det_model_fn.py:190) ]]
  [[strided_slice_3/_8201]]
(1) Invalid argument: Incompatible shapes: [1,810,128,128] vs. [1,9,128,128]
  [[node focal_loss/logistic_loss/GreaterEqual (defined at /home/koteswara_rao/automl_versio1/efficientdet/det_model_fn.py:190) ]]
0 successful operations.
0 derived errors ignored.

I am using TensorFlow 2.1.0.

Koteswara-ML avatar Jun 28 '20 17:06 Koteswara-ML

@mingxingtan, can you please help with this? It is a little urgent.

Koteswara-ML avatar Jun 28 '20 17:06 Koteswara-ML

Hi @Koteswara-ML, apparently it is because num_classes differs between training and eval time: one uses 90 classes and the other uses 1 class.

mingxingtan avatar Jun 28 '20 18:06 mingxingtan
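
The two channel counts in the error message fit this explanation, assuming the default of 9 anchors per feature-map location (3 scales x 3 aspect ratios) and a class head that outputs num_classes * num_anchors channels. A minimal sketch of that arithmetic follows; the helper name is hypothetical and is not part of the repository.

```python
def classes_from_channels(channels: int, anchors_per_location: int = 9) -> int:
    """Recover num_classes from a class-head channel count.

    Assumes channels == num_classes * anchors_per_location, with 9 anchors
    per location (an assumption based on the default anchor settings).
    """
    if channels % anchors_per_location != 0:
        raise ValueError(
            f"{channels} channels is not a multiple of {anchors_per_location} anchors"
        )
    return channels // anchors_per_location


# Channel dimensions taken from the two shapes in the error message.
print(classes_from_channels(810))  # 90
print(classes_from_channels(9))    # 1
```

Under these assumptions, 810 channels corresponds to 90 classes and 9 channels to a single class, so the two sides of the comparison appear to have been built with different num_classes values.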

I changed voc_config.yaml to the following, for both training and testing:

num_classes: 2
var_freeze_expr: '(efficientnet|fpn_cells|resample_p6)'
label_id_mapping: {0: background, 1: person}

Now I am getting this:

 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = -1.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = -1.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = -1.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = -1.000

What might be the issue?

Koteswara-ML avatar Jun 29 '20 03:06 Koteswara-ML
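
A note on reading the summary above: pycocotools prints -1.000 for a row when there is nothing to average, which typically means no ground-truth boxes fall in that area range. The medium and large rows being -1 therefore suggests every ground-truth box was counted as small. That could happen if the objects really are tiny, or if box coordinates were written in normalized [0, 1] units instead of pixels; the second possibility is only a guess and is not confirmed anywhere in this thread. The sketch below buckets (width, height) pairs using the COCO area thresholds of 32^2 and 96^2 pixels; the function names are hypothetical.

```python
SMALL_MAX_AREA = 32 ** 2   # COCO "small" objects: area below 32x32 pixels
MEDIUM_MAX_AREA = 96 ** 2  # COCO "medium" objects: area below 96x96 pixels


def coco_area_bucket(width: float, height: float) -> str:
    """Return the COCO size bucket for a box given in pixels.

    Boundary handling is simplified here; pycocotools treats the
    range edges slightly differently.
    """
    area = width * height
    if area < SMALL_MAX_AREA:
        return "small"
    if area < MEDIUM_MAX_AREA:
        return "medium"
    return "large"


def bucket_counts(boxes):
    """Count how many (width, height) boxes land in each bucket."""
    counts = {"small": 0, "medium": 0, "large": 0}
    for width, height in boxes:
        counts[coco_area_bucket(width, height)] += 1
    return counts


# Illustrative boxes only, not taken from any real dataset.
pixel_boxes = [(20, 30), (40, 100), (120, 200)]
normalized_boxes = [(0.1, 0.3), (0.05, 0.2)]
print(bucket_counts(pixel_boxes))
print(bucket_counts(normalized_boxes))
```

If every ground-truth box in a dataset lands in the small bucket, comparing a few annotations against the image sizes is a quick way to check whether the units are right.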

@mingxingtan, what changes are needed in order to train on a single label, person?

Koteswara-ML avatar Jun 29 '20 14:06 Koteswara-ML

@Koteswara-ML Can you make sure your data is correct? I tried Pascal VOC with one class (trained on only 400 examples) and got non-zero accuracy.

mingxingtan avatar Jul 01 '20 19:07 mingxingtan

If your data is indeed correct, you can debug the issue by printing out the detections here: https://github.com/google/automl/blob/14548b7175e093c9f0586b372180c41ffc04fbc1/efficientdet/anchors.py#L543

mingxingtan avatar Jul 01 '20 19:07 mingxingtan

Hey @mingxingtan, I have been using the Caltech dataset for object detection, after converting it to VOC format. The data is very sparse: only 40% of the images have detections in them. Do you think that might be the issue?

But yes, as you said, there was an issue with the dataset. I used the same script on the MOT dataset, which gave non-zero results and seemed fine. I had also suspected that the dataset itself might be the problem.

Koteswara-ML avatar Jul 02 '20 03:07 Koteswara-ML

One more thing: why is the number of detections being considered equal to (number of detections * 100)? How can I make it consider the actual number of detections?

Koteswara-ML avatar Jul 02 '20 17:07 Koteswara-ML
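
One possible explanation, offered as an assumption rather than confirmed behaviour: the evaluation pipeline may emit a fixed number of detection slots per image (100 would match maxDets=100 in the summary), so low-score filler rows are counted along with real detections. If that is the case, filtering by score before inspecting the output gives a more realistic count. The sketch below assumes each row is laid out as [image_id, x, y, width, height, score, class]; both that layout and the function name are hypothetical.

```python
def keep_confident(detections, min_score: float = 0.3, score_index: int = 5):
    """Return only the detection rows whose score is at least min_score.

    The row layout [image_id, x, y, width, height, score, class] is an
    assumption; change score_index if the actual layout differs.
    """
    return [row for row in detections if row[score_index] >= min_score]


# Made-up rows: two plausible detections and one near-zero filler row.
detections = [
    [1, 10.0, 20.0, 30.0, 60.0, 0.90, 1],
    [1, 0.0, 0.0, 0.0, 0.0, 0.01, 1],
    [1, 5.0, 5.0, 20.0, 40.0, 0.45, 1],
]

confident = keep_confident(detections)
print(len(detections), len(confident))  # 3 2
```

This is meant for inspecting what the model predicts, not as a change to how the COCO metrics are computed.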

Hey, I converted COCO data with 90 classes and I am facing the same issue. Has a solution been found?

NeetiG26 avatar Jul 29 '20 17:07 NeetiG26

@Koteswara-ML I am currently trying to run it on the MOT15 dataset with EfficientDet-D0, and I am getting mAP and AP values of -1. Can you please share your config.yaml file for this?

Also, did you use D0 or D4?

Please help me with this.

BhandarkarPawan avatar Nov 24 '21 07:11 BhandarkarPawan