TensorFlow-Object-Detection-API-Tutorial-Train-Multiple-Objects-Windows-10

Error while training custom model with ssd_inception_v2_coco

Open shABanty opened this issue 5 years ago • 7 comments

(tensorflow_cpu)C:\Users\AB\Tensorflow\workplace\training_demo>python train.py --logtostderr --train_dir=training/ --pipeline_config_path=training/ssd_inception_v2_coco.config

WARNING:tensorflow:From C:\Users\AB\Anaconda3\envs\tensorflow_cpu\lib\site-packages\tensorflow\python\platform\app.py:125: main (from __main__) is deprecated and will be removed in a future version.
Instructions for updating:
Use object_detection/model_main.py.
WARNING:tensorflow:From C:\Users\AB\Anaconda3\envs\tensorflow_cpu\lib\site-packages\object_detection\legacy\trainer.py:266: create_global_step (from tensorflow.contrib.framework.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Please switch to tf.train.create_global_step
WARNING:tensorflow:num_readers has been reduced to 1 to match input file shards.
WARNING:tensorflow:From C:\Users\AB\Anaconda3\envs\tensorflow_cpu\lib\site-packages\object_detection\core\preprocessor.py:1240: calling squeeze (from tensorflow.python.ops.array_ops) with squeeze_dims is deprecated and will be removed in a future version.
Instructions for updating:
Use the axis argument instead
INFO:tensorflow:depth of additional conv before box predictor: 0
INFO:tensorflow:depth of additional conv before box predictor: 0
INFO:tensorflow:depth of additional conv before box predictor: 0
INFO:tensorflow:depth of additional conv before box predictor: 0
INFO:tensorflow:depth of additional conv before box predictor: 0
INFO:tensorflow:depth of additional conv before box predictor: 0
WARNING:tensorflow:From C:\Users\AB\Anaconda3\envs\tensorflow_cpu\lib\site-packages\tensorflow\contrib\slim\python\slim\learning.py:737: Supervisor.__init__ (from tensorflow.python.training.supervisor) is deprecated and will be removed in a future version.
Instructions for updating:
Please switch to tf.train.MonitoredTrainingSession
2020-02-07 01:57:41.478395: I T:\src\github\tensorflow\tensorflow\core\platform\cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
INFO:tensorflow:Restoring parameters from C:\Users\AB\Tensorflow\workplace\training_demo\pre-trained-model\model.ckpt
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Starting Session.
INFO:tensorflow:Saving checkpoint to path training/model.ckpt
INFO:tensorflow:Starting Queues.
INFO:tensorflow:global_step/sec: 0
2020-02-07 01:58:16.645162: I T:\src\github\tensorflow\tensorflow\core\kernels\data\shuffle_dataset_op.cc:95] Filling up shuffle buffer (this may take a while): 480 of 2048
2020-02-07 01:58:26.795941: I T:\src\github\tensorflow\tensorflow\core\kernels\data\shuffle_dataset_op.cc:95] Filling up shuffle buffer (this may take a while): 633 of 2048
2020-02-07 0

shABanty avatar Feb 06 '20 18:02 shABanty

I'm using the model here: https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/detection_model_zoo.md

and config here: https://github.com/tensorflow/models/blob/master/research/object_detection/samples/configs/ssd_inception_v2_coco.config

shABanty avatar Feb 06 '20 18:02 shABanty

The config looks like this:

model {
  ssd {
    num_classes: 10
    box_coder {
      faster_rcnn_box_coder {
        y_scale: 10.0
        x_scale: 10.0
        height_scale: 5.0
        width_scale: 5.0
      }
    }
    matcher {
      argmax_matcher {
        matched_threshold: 0.5
        unmatched_threshold: 0.5
        ignore_thresholds: false
        negatives_lower_than_unmatched: true
        force_match_for_each_row: true
      }
    }
    similarity_calculator {
      iou_similarity {
      }
    }
    anchor_generator {
      ssd_anchor_generator {
        num_layers: 6
        min_scale: 0.2
        max_scale: 0.95
        aspect_ratios: 1.0
        aspect_ratios: 2.0
        aspect_ratios: 0.5
        aspect_ratios: 3.0
        aspect_ratios: 0.3333
        reduce_boxes_in_lowest_layer: true
      }
    }
    image_resizer {
      fixed_shape_resizer {
        height: 300
        width: 300
      }
    }
    box_predictor {
      convolutional_box_predictor {
        min_depth: 0
        max_depth: 0
        num_layers_before_predictor: 0
        use_dropout: false
        dropout_keep_probability: 0.8
        kernel_size: 3
        box_code_size: 4
        apply_sigmoid_to_scores: false
        conv_hyperparams {
          activation: RELU_6,
          regularizer {
            l2_regularizer {
              weight: 0.00004
            }
          }
          initializer {
            truncated_normal_initializer {
              stddev: 0.03
              mean: 0.0
            }
          }
        }
      }
    }
    feature_extractor {
      type: 'ssd_inception_v2'
      min_depth: 16
      depth_multiplier: 1.0
      conv_hyperparams {
        activation: RELU_6,
        regularizer {
          l2_regularizer {
            weight: 0.00004
          }
        }
        initializer {
          truncated_normal_initializer {
            stddev: 0.03
            mean: 0.0
          }
        }
        batch_norm {
          train: true,
          scale: true,
          center: true,
          decay: 0.9997,
          epsilon: 0.001,
        }
      }
      override_base_feature_extractor_hyperparams: true
    }
    loss {
      classification_loss {
        weighted_sigmoid {
        }
      }
      localization_loss {
        weighted_smooth_l1 {
        }
      }
      hard_example_miner {
        num_hard_examples: 3000
        iou_threshold: 0.99
        loss_type: CLASSIFICATION
        max_negatives_per_positive: 3
        min_negatives_per_image: 0
      }
      classification_weight: 1.0
      localization_weight: 1.0
    }
    normalize_loss_by_num_matches: true
    post_processing {
      batch_non_max_suppression {
        score_threshold: 1e-8
        iou_threshold: 0.6
        max_detections_per_class: 100
        max_total_detections: 100
      }
      score_converter: SIGMOID
    }
  }
}

train_config: {
  batch_size: 10
  optimizer {
    rms_prop_optimizer: {
      learning_rate: {
        exponential_decay_learning_rate {
          initial_learning_rate: 0.004
          decay_steps: 800720
          decay_factor: 0.95
        }
      }
      momentum_optimizer_value: 0.9
      decay: 0.9
      epsilon: 1.0
    }
  }
  fine_tune_checkpoint: "C:\Users\AB\Tensorflow\workplace\training_demo\pre-trained-model\model.ckpt"
  from_detection_checkpoint: true
  # Note: The below line limits the training process to 200K steps, which we
  # empirically found to be sufficient enough to train the pets dataset. This
  # effectively bypasses the learning rate schedule (the learning rate will
  # never decay). Remove the below line to train indefinitely.
  num_steps: 200000
  fine_tune_checkpoint:"C:\Users\AB\Tensorflow\workplace\training_demo\pre-trained-model\model.ckpt"
  fine_tune_checkpoint_type: "detection"
  data_augmentation_options {
    random_horizontal_flip {
    }
  }
  data_augmentation_options {
    ssd_random_crop {
    }
  }
}

train_input_reader: {
  tf_record_input_reader {
    input_path: "C:\Users\AB\Tensorflow\workplace\training_demo\annotations\train.record"
  }
  label_map_path: "C:\Users\AB\Tensorflow\workplace\training_demo\annotations\label_map.pbtxt"
}

eval_config: {
  num_examples: 8000
  # Note: The below line limits the evaluation process to 10 evaluations.
  # Remove the below line to evaluate indefinitely.
  max_evals: 10
}

eval_input_reader: {
  tf_record_input_reader {
    input_path: "C:\Users\AB\Tensorflow\workplace\training_demo\annotations\test.record"
  }
  label_map_path: "C:\Users\AB\Tensorflow\workplace\training_demo\annotations\label_map.pbtxt"
  shuffle: false
  num_readers: 1
}

shABanty avatar Feb 06 '20 18:02 shABanty

First, change the paths in your config to use forward slashes, from: C:\Users\AB\Tensorflow\workplace\training_demo\annotations\train.record to: C:/Users/AB/Tensorflow/workplace/training_demo/annotations/train.record
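A likely reason this matters: in a protobuf text-format string, a backslash begins an escape sequence, so Windows-style separators can be misread. As a minimal sketch, the conversion can be done with Python's standard `pathlib` instead of by hand. The helper name `to_config_path` is hypothetical and not part of the Object Detection API.

```python
from pathlib import PureWindowsPath


def to_config_path(windows_path: str) -> str:
    """Return the path with forward slashes, suitable for pasting into the config.

    PureWindowsPath parses Windows separators on any operating system,
    and as_posix() renders the same path with '/' separators.
    """
    return PureWindowsPath(windows_path).as_posix()


if __name__ == "__main__":
    # Raw string so the backslashes are not treated as Python escapes.
    original = r"C:\Users\AB\Tensorflow\workplace\training_demo\annotations\train.record"
    print(to_config_path(original))
```

The same helper can be applied to each `input_path`, `label_map_path`, and `fine_tune_checkpoint` value before writing them into the config.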

Ahmadzia307 avatar Feb 07 '20 20:02 Ahmadzia307

@Ahmadzia307 Thanks for your reply. I've changed the path format. I'm now using the pre-trained SSD Inception v2 COCO model instead, and I no longer get the error above, but my training does not start and no loss value is shown, as follows.

(tensorflow_cpu) C:\Users\AB\Tensorflow\workplace\training_demo>python train.py \
 --logtostderr \
 --train_dir=train \
 --pipeline_config_path=training\ssd_inception_v2_coco.config

WARNING:tensorflow:From C:\Users\AB\Anaconda3\envs\tensorflow_cpu\lib\site-packages\tensorflow\python\platform\app.py:125: main (from __main__) is deprecated and will be removed in a future version.
Instructions for updating:
Use object_detection/model_main.py.
WARNING:tensorflow:From C:\Users\AB\Anaconda3\envs\tensorflow_cpu\lib\site-packages\object_detection\legacy\trainer.py:266: create_global_step (from tensorflow.contrib.framework.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Please switch to tf.train.create_global_step
WARNING:tensorflow:num_readers has been reduced to 1 to match input file shards.
WARNING:tensorflow:From C:\Users\AB\Anaconda3\envs\tensorflow_cpu\lib\site-packages\object_detection\core\preprocessor.py:1240: calling squeeze (from tensorflow.python.ops.array_ops) with squeeze_dims is deprecated and will be removed in a future version.
Instructions for updating:
Use the axis argument instead
INFO:tensorflow:depth of additional conv before box predictor: 0
INFO:tensorflow:depth of additional conv before box predictor: 0
INFO:tensorflow:depth of additional conv before box predictor: 0
INFO:tensorflow:depth of additional conv before box predictor: 0
INFO:tensorflow:depth of additional conv before box predictor: 0
INFO:tensorflow:depth of additional conv before box predictor: 0
WARNING:tensorflow:From C:\Users\AB\Anaconda3\envs\tensorflow_cpu\lib\site-packages\tensorflow\contrib\slim\python\slim\learning.py:737: Supervisor.__init__ (from tensorflow.python.training.supervisor) is deprecated and will be removed in a future version.
Instructions for updating:
Please switch to tf.train.MonitoredTrainingSession
2020-02-08 12:34:31.414876: I T:\src\github\tensorflow\tensorflow\core\platform\cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
INFO:tensorflow:Restoring parameters from C:/Users/AB/Tensorflow/workplace/training_demo/pre-trained-model/model.ckpt
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Starting Session.
INFO:tensorflow:Saving checkpoint to path train\model.ckpt
INFO:tensorflow:Starting Queues.
INFO:tensorflow:global_step/sec: 0
2020-02-08 12:35:02.919828: I T:\src\github\tensorflow\tensorflow\core\kernels\data\shuffle_dataset_op.cc:95] Filling up shuffle buffer (this may take a while): 480 of 2048
2020-02-08 12:35:12.815315: I T:\src\github\tensorflow\tensorflow\core\kernels\data\shuffle_dataset_op.cc:95] Filling up shuffle buffer (this may take a while): 673 of 2048
2020-02-08 12:35:22.678793: I T:\src\github\tensorflow\tensorflow\core\kernels\data\shuffle_dataset_op.cc:95] Filling up shuffle buffer (this may take a while): 800 of 2048
2020-02-08 12:35:32.710125: I T:\src\github\tensorflow\tensorflow\core\kernels\data\shuffle_dataset_op.cc:95] Filling up shuffle buffer (this may take a while): 1019 of 2048
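The shuffle-buffer counter in this log is still rising (480, 673, 800, 1019 of 2048), which suggests the input pipeline is making progress rather than hanging. Below is a rough sketch of estimating the remaining fill time from those four timestamps, assuming the fill rate stays roughly constant. The helper names are hypothetical, and the estimate says nothing about how fast training steps will run once the buffer is full.

```python
import re
from datetime import datetime

# The four shuffle-buffer lines copied from the log (raw string keeps backslashes literal).
LOG = r"""
2020-02-08 12:35:02.919828: I T:\src\github\tensorflow\tensorflow\core\kernels\data\shuffle_dataset_op.cc:95] Filling up shuffle buffer (this may take a while): 480 of 2048
2020-02-08 12:35:12.815315: I T:\src\github\tensorflow\tensorflow\core\kernels\data\shuffle_dataset_op.cc:95] Filling up shuffle buffer (this may take a while): 673 of 2048
2020-02-08 12:35:22.678793: I T:\src\github\tensorflow\tensorflow\core\kernels\data\shuffle_dataset_op.cc:95] Filling up shuffle buffer (this may take a while): 800 of 2048
2020-02-08 12:35:32.710125: I T:\src\github\tensorflow\tensorflow\core\kernels\data\shuffle_dataset_op.cc:95] Filling up shuffle buffer (this may take a while): 1019 of 2048
"""

# Timestamp, then anything on the same line, then "<filled> of <capacity>".
PATTERN = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+): .*?"
    r"Filling up shuffle buffer \(this may take a while\): (\d+) of (\d+)"
)


def parse_fill_samples(log_text):
    """Return a list of (timestamp, filled, capacity) tuples found in the log."""
    samples = []
    for match in PATTERN.finditer(log_text):
        stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f")
        samples.append((stamp, int(match.group(2)), int(match.group(3))))
    return samples


def estimate_seconds_remaining(samples):
    """Linear estimate of the time left, using the first and last samples."""
    first_time, first_filled, _ = samples[0]
    last_time, last_filled, capacity = samples[-1]
    elapsed = (last_time - first_time).total_seconds()
    rate = (last_filled - first_filled) / elapsed  # records per second
    return (capacity - last_filled) / rate


if __name__ == "__main__":
    samples = parse_fill_samples(LOG)
    print(f"{len(samples)} samples parsed")
    print(f"about {estimate_seconds_remaining(samples):.0f} s until the buffer is full")
```

If the counter stops rising for a long time, or the estimate keeps growing between runs, that would point to a different problem than a slow input pipeline.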

shABanty avatar Feb 08 '20 04:02 shABanty

config

model {
  ssd {
    num_classes: 10
    box_coder {
      faster_rcnn_box_coder {
        y_scale: 10.0
        x_scale: 10.0
        height_scale: 5.0
        width_scale: 5.0
      }
    }
    matcher {
      argmax_matcher {
        matched_threshold: 0.5
        unmatched_threshold: 0.5
        ignore_thresholds: false
        negatives_lower_than_unmatched: true
        force_match_for_each_row: true
      }
    }
    similarity_calculator {
      iou_similarity {
      }
    }
    anchor_generator {
      ssd_anchor_generator {
        num_layers: 6
        min_scale: 0.2
        max_scale: 0.95
        aspect_ratios: 1.0
        aspect_ratios: 2.0
        aspect_ratios: 0.5
        aspect_ratios: 3.0
        aspect_ratios: 0.3333
        reduce_boxes_in_lowest_layer: true
      }
    }
    image_resizer {
      fixed_shape_resizer {
        height: 300
        width: 300
      }
    }
    box_predictor {
      convolutional_box_predictor {
        min_depth: 0
        max_depth: 0
        num_layers_before_predictor: 0
        use_dropout: false
        dropout_keep_probability: 0.8
        kernel_size: 3
        box_code_size: 4
        apply_sigmoid_to_scores: false
        conv_hyperparams {
          activation: RELU_6,
          regularizer {
            l2_regularizer {
              weight: 0.00004
            }
          }
          initializer {
            truncated_normal_initializer {
              stddev: 0.03
              mean: 0.0
            }
          }
        }
      }
    }
    feature_extractor {
      type: 'ssd_inception_v2'
      min_depth: 16
      depth_multiplier: 1.0
      conv_hyperparams {
        activation: RELU_6,
        regularizer {
          l2_regularizer {
            weight: 0.00004
          }
        }
        initializer {
          truncated_normal_initializer {
            stddev: 0.03
            mean: 0.0
          }
        }
        batch_norm {
          train: true,
          scale: true,
          center: true,
          decay: 0.9997,
          epsilon: 0.001,
        }
      }
      override_base_feature_extractor_hyperparams: true
    }
    loss {
      classification_loss {
        weighted_sigmoid {
        }
      }
      localization_loss {
        weighted_smooth_l1 {
        }
      }
      hard_example_miner {
        num_hard_examples: 3000
        iou_threshold: 0.99
        loss_type: CLASSIFICATION
        max_negatives_per_positive: 3
        min_negatives_per_image: 0
      }
      classification_weight: 1.0
      localization_weight: 1.0
    }
    normalize_loss_by_num_matches: true
    post_processing {
      batch_non_max_suppression {
        score_threshold: 1e-8
        iou_threshold: 0.6
        max_detections_per_class: 100
        max_total_detections: 100
      }
      score_converter: SIGMOID
    }
  }
}

train_config: {
  batch_size: 1
  optimizer {
    rms_prop_optimizer: {
      learning_rate: {
        exponential_decay_learning_rate {
          initial_learning_rate: 0.004
          decay_steps: 800720
          decay_factor: 0.95
        }
      }
      momentum_optimizer_value: 0.9
      decay: 0.9
      epsilon: 1.0
    }
  }
  fine_tune_checkpoint: "C:/Users/AB/Tensorflow/workplace/training_demo/pre-trained-model/model.ckpt"
  from_detection_checkpoint: true
  # Note: The below line limits the training process to 200K steps, which we
  # empirically found to be sufficient enough to train the pets dataset. This
  # effectively bypasses the learning rate schedule (the learning rate will
  # never decay). Remove the below line to train indefinitely.
  num_steps: 200000
  fine_tune_checkpoint:"C:/Users/AB/Tensorflow/workplace/training_demo/pre-trained-model/model.ckpt"
  fine_tune_checkpoint_type: "detection"
  data_augmentation_options {
    random_horizontal_flip {
    }
  }
  data_augmentation_options {
    ssd_random_crop {
    }
  }
}

train_input_reader: {
  tf_record_input_reader {
    input_path: "C:/Users/AB/Tensorflow/workplace/training_demo/annotations/train.record"
  }
  label_map_path: "C:/Users/AB/Tensorflow/workplace/training_demo/annotations/label_map.pbtxt"
}

eval_config: {
  num_examples: 8000
  # Note: The below line limits the evaluation process to 10 evaluations.
  # Remove the below line to evaluate indefinitely.
  max_evals: 10
}

eval_input_reader: {
  tf_record_input_reader {
    input_path: "C:/Users/AB/Tensorflow/workplace/training_demo/annotations/test.record"
  }
  label_map_path: "C:/Users/AB/Tensorflow/workplace/training_demo/annotations/label_map.pbtxt"
  shuffle: false
  num_readers: 1
}

shABanty avatar Feb 08 '20 04:02 shABanty

Hi, did you solve this problem? I have the same problem.

YuanTG avatar Apr 23 '20 08:04 YuanTG

@shABanty Please do people a favor and don't post the whole config file; if you give them the link, they can see it themselves. Did you finally find a solution?

Petros626 avatar May 17 '22 05:05 Petros626