TensorFlow-Object-Detection-API-Tutorial-Train-Multiple-Objects-Windows-10

error First step cannot be zero when running train.py

Open blockhunts opened this issue 6 years ago • 20 comments

I tried to use the same (card) images provided. I just deleted all the processed files (csv, dll) and followed all the steps. When I ran python train.py, I got this error:

Traceback (most recent call last):
  File "train.py", line 184, in <module>
    tf.app.run()
  File "C:\Users\MRCPP-Fablab\AppData\Local\conda\conda\envs\tensorflow\lib\site-packages\tensorflow\python\platform\app.py", line 126, in run
    _sys.exit(main(argv))
  File "train.py", line 180, in main
    graph_hook_fn=graph_rewriter_fn)
  File "E:\tensor\models\research\object_detection\trainer.py", line 288, in train
    train_config.optimizer)
  File "E:\tensor\models\research\object_detection\builders\optimizer_builder.py", line 50, in build
    learning_rate = _create_learning_rate(config.learning_rate)
  File "E:\tensor\models\research\object_detection\builders\optimizer_builder.py", line 109, in _create_learning_rate
    learning_rate_sequence, config.warmup)
  File "E:\tensor\models\research\object_detection\utils\learning_schedules.py", line 156, in manual_stepping
    raise ValueError('First step cannot be zero.')
ValueError: First step cannot be zero.

Any clues why this happens?

blockhunts avatar May 28 '18 02:05 blockhunts

I have the same error. Did you find out how to solve it?

Surasi-Jui avatar Jun 03 '18 11:06 Surasi-Jui

yes, edit this in your config file in ...\models\research\object_detection\training

train_config: {
  batch_size: 1
  optimizer {
    momentum_optimizer: {
      learning_rate: {
        manual_step_learning_rate {
          initial_learning_rate: 0.0002
          schedule {
            step: 900000
            learning_rate: .00002
          }
          schedule {
            step: 1200000
            learning_rate: .000002
          }
        }
      }
      momentum_optimizer_value: 0.9
    }
    use_moving_average: false
  }

blockhunts avatar Jun 04 '18 09:06 blockhunts

Thank you. It works here :)

Surasi-Jui avatar Jun 04 '18 10:06 Surasi-Jui

If you download the model from the GitHub repository, the files are up to date.

leccyril avatar Jun 28 '18 07:06 leccyril

I ran into this same error while using the AWS DL AMI (Deep Learning AMI (Ubuntu) Version 10.0 (ami-23c4fb46)) and following, as far as I can tell, the same steps I used on Windows, with obvious substitutions since this AMI is Ubuntu. Both Ubuntu and Windows are using TF 1.8. But when I use the train_config that blockhunts mentioned I get:

Traceback (most recent call last):
  File "/ml/models/research/object_detection/train.py", line 184, in <module>
    tf.app.run()
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/tensorflow/python/platform/app.py", line 126, in run
    _sys.exit(main(argv))
  File "/ml/models/research/object_detection/train.py", line 180, in main
    graph_hook_fn=graph_rewriter_fn)
  File "/ml/models/research/object_detection/trainer.py", line 298, in train
    train_config.optimizer)
  File "/ml/models/research/object_detection/builders/optimizer_builder.py", line 50, in build
    learning_rate = _create_learning_rate(config.learning_rate)
  File "/ml/models/research/object_detection/builders/optimizer_builder.py", line 109, in _create_learning_rate
    learning_rate_sequence, config.warmup)
  File "/ml/models/research/object_detection/utils/learning_schedules.py", line 169, in manual_stepping
    [0] * num_boundaries))
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/tensorflow/python/ops/array_ops.py", line 2681, in where
    return gen_math_ops.select(condition=condition, x=x, y=y, name=name)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/tensorflow/python/ops/gen_math_ops.py", line 6699, in select
    "Select", condition=condition, t=x, e=y, name=name)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 528, in _apply_op_helper
    (input_name, err))
ValueError: Tried to convert 't' to a tensor and failed. Error: Argument must be a dense tensor: range(0, 3) - got shape [3], but wanted [].

Any ideas?

jim-meyer avatar Jun 30 '18 20:06 jim-meyer

I see that epratheeban has the solution to my problem mentioned here https://github.com/EdjeElectronics/TensorFlow-Object-Detection-API-Tutorial-Train-Multiple-Objects-Windows-10/issues/11:

It's easy. Go to the utils folder and find the learning_schedules.py file. Go to line 167 and replace it with the line below:

rate_index = tf.reduce_max(tf.where(tf.greater_equal(global_step, boundaries), list(range(num_boundaries)), [0] * num_boundaries))
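
For anyone wondering why wrapping the range in list() helps, here is a pure-Python sketch. It is not the TensorFlow code. On Python 3, range(n) is its own lazy type rather than a list, which is consistent with the "range(0, 3)" shown in the error above, while list(range(n)) is a plain list of ints. The helper name rate_index_for and the example boundaries are made up for illustration; the helper only mimics what the replaced line computes.

```python
# Pure-Python illustration; `rate_index_for` is a hypothetical helper,
# not part of object_detection or TensorFlow.

def rate_index_for(global_step, boundaries):
    """Mimic reduce_max(where(global_step >= boundaries, indices, zeros))."""
    num_boundaries = len(boundaries)
    indices = list(range(num_boundaries))  # a real list, as in the fix
    zeros = [0] * num_boundaries
    picked = [i if global_step >= b else z
              for b, i, z in zip(boundaries, indices, zeros)]
    return max(picked)

# On Python 3, range() is its own type, not a list.
assert not isinstance(range(3), list)
assert list(range(3)) == [0, 1, 2]

# Example boundaries (assumed): a leading 0 plus the two schedule steps.
boundaries = [0, 900000, 1200000]
print(rate_index_for(10, boundaries))
print(rate_index_for(900000, boundaries))
print(rate_index_for(1500000, boundaries))
```

The returned index grows as global_step passes each boundary, which is how the schedule picks the matching learning rate.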

jim-meyer avatar Jun 30 '18 20:06 jim-meyer

Hi @jim-meyer, I made this change and that problem was solved, but now I get this error:

WARNING:tensorflow:From C:\Users\sadegh\Anaconda3\envs\tensorflow1\lib\site-packages\object_detection-0.1-py3.5.egg\object_detection\core\losses.py:317: softmax_cross_entropy_with_logits (from tensorflow.python.ops.nn_ops) is deprecated and will be removed in a future version.
Instructions for updating:

Future major versions of TensorFlow will allow gradients to flow into the labels input on backprop by default.

See @{tf.nn.softmax_cross_entropy_with_logits_v2}.

Traceback (most recent call last):
  File "train.py", line 184, in <module>
    tf.app.run()
  File "C:\Users\sadegh\Anaconda3\envs\tensorflow1\lib\site-packages\tensorflow\python\platform\app.py", line 126, in run
    _sys.exit(main(argv))
  File "train.py", line 180, in main
    graph_hook_fn=graph_rewriter_fn)
  File "C:\Users\sadegh\Anaconda3\envs\tensorflow1\lib\site-packages\object_detection-0.1-py3.5.egg\object_detection\trainer.py", line 288, in train
    train_config.optimizer)
  File "C:\Users\sadegh\Anaconda3\envs\tensorflow1\lib\site-packages\object_detection-0.1-py3.5.egg\object_detection\builders\optimizer_builder.py", line 50, in build
    learning_rate = _create_learning_rate(config.learning_rate)
  File "C:\Users\sadegh\Anaconda3\envs\tensorflow1\lib\site-packages\object_detection-0.1-py3.5.egg\object_detection\builders\optimizer_builder.py", line 109, in _create_learning_rate
    learning_rate_sequence, config.warmup)
  File "C:\Users\sadegh\Anaconda3\envs\tensorflow1\lib\site-packages\object_detection-0.1-py3.5.egg\object_detection\utils\learning_schedules.py", line 168, in manual_stepping
    list(num_boundaries),
TypeError: 'int' object is not iterable

aghapesar1374 avatar Jul 10 '18 20:07 aghapesar1374
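
The last frame of that traceback shows list(num_boundaries), so it looks like the range(...) call was dropped when the line was edited. That is only a guess from the traceback. A small sketch with a made-up value for num_boundaries reproduces the same TypeError and shows what the suggested fix calls instead:

```python
num_boundaries = 3  # example value, assumed for illustration

try:
    list(num_boundaries)  # an int is not iterable, so this raises
    message = None
except TypeError as err:
    message = str(err)

print(message)

# What the suggested fix actually calls:
indices = list(range(num_boundaries))
print(indices)
```

If this is the cause, restoring list(range(num_boundaries)) on that line should clear the error.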

TypeError: Cannot convert a list containing a tensor of dtype <dtype: 'int32'> to <dtype: 'float32'> (Tensor is: <tf.Tensor 'Preprocessor/stack_1:0' shape=(1, 3) dtype=int32>)

tamizharasank avatar Jul 20 '18 11:07 tamizharasank

@tamizharasank Which file? For this kind of error, paste it into Google and you will find the fix easily.

leccyril avatar Jul 20 '18 11:07 leccyril

@tamizharasank Did you solve this error? I got the same error. Any suggestions?

Adibhatt95 avatar Jul 23 '18 16:07 Adibhatt95

After making changes to the config file in the training folder, I got this error:

(tensorflow1) C:\tensorflow1\models\research\object_detection>python train.py --logtostderr --train_dir=training/ --pipeline_config_path=training/faster_rcnn_inception_v2_pets.config
WARNING:tensorflow:From C:\Users\kayka\Anaconda3\envs\tensorflow1\lib\site-packages\tensorflow\python\platform\app.py:125: main (from __main__) is deprecated and will be removed in a future version.
Instructions for updating:
Use object_detection/model_main.py.
WARNING:tensorflow:From C:\Users\kayka\Anaconda3\envs\tensorflow1\lib\site-packages\object_detection-0.1-py3.5.egg\object_detection\legacy\trainer.py:266: create_global_step (from tensorflow.contrib.framework.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Please switch to tf.train.create_global_step
WARNING:tensorflow:num_readers has been reduced to 1 to match input file shards.
INFO:tensorflow:Scale of 0 disables regularizer.
INFO:tensorflow:Scale of 0 disables regularizer.
INFO:tensorflow:depth of additional conv before box predictor: 0
WARNING:tensorflow:From C:\Users\kayka\Anaconda3\envs\tensorflow1\lib\site-packages\object_detection-0.1-py3.5.egg\object_detection\predictors\heads\box_head.py:93: calling reduce_mean (from tensorflow.python.ops.math_ops) with keep_dims is deprecated and will be removed in a future version.
Instructions for updating:
keep_dims is deprecated, use keepdims instead
INFO:tensorflow:Scale of 0 disables regularizer.
INFO:tensorflow:Scale of 0 disables regularizer.
WARNING:tensorflow:From C:\Users\kayka\Anaconda3\envs\tensorflow1\lib\site-packages\object_detection-0.1-py3.5.egg\object_detection\core\losses.py:345: softmax_cross_entropy_with_logits (from tensorflow.python.ops.nn_ops) is deprecated and will be removed in a future version.
Instructions for updating:

Future major versions of TensorFlow will allow gradients to flow into the labels input on backprop by default.

See @{tf.nn.softmax_cross_entropy_with_logits_v2}.

C:\Users\kayka\Anaconda3\envs\tensorflow1\lib\site-packages\tensorflow\python\ops\gradients_impl.py:108: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
WARNING:tensorflow:From C:\Users\kayka\Anaconda3\envs\tensorflow1\lib\site-packages\object_detection-0.1-py3.5.egg\object_detection\meta_architectures\faster_rcnn_meta_arch.py:2236: get_or_create_global_step (from tensorflow.contrib.framework.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Please switch to tf.train.get_or_create_global_step
Traceback (most recent call last):
  File "train.py", line 184, in <module>
    tf.app.run()
  File "C:\Users\kayka\Anaconda3\envs\tensorflow1\lib\site-packages\tensorflow\python\platform\app.py", line 125, in run
    _sys.exit(main(argv))
  File "C:\Users\kayka\Anaconda3\envs\tensorflow1\lib\site-packages\tensorflow\python\util\deprecation.py", line 272, in new_func
    return func(*args, **kwargs)
  File "train.py", line 180, in main
    graph_hook_fn=graph_rewriter_fn)
  File "C:\Users\kayka\Anaconda3\envs\tensorflow1\lib\site-packages\object_detection-0.1-py3.5.egg\object_detection\legacy\trainer.py", line 397, in train
    include_global_step=False))
  File "C:\Users\kayka\Anaconda3\envs\tensorflow1\lib\site-packages\object_detection-0.1-py3.5.egg\object_detection\utils\variables_helper.py", line 126, in get_variables_available_in_checkpoint
    ckpt_reader = tf.train.NewCheckpointReader(checkpoint_path)
  File "C:\Users\kayka\Anaconda3\envs\tensorflow1\lib\site-packages\tensorflow\python\pywrap_tensorflow_internal.py", line 306, in NewCheckpointReader
    return CheckpointReader(compat.as_bytes(filepattern), status)
  File "C:\Users\kayka\Anaconda3\envs\tensorflow1\lib\site-packages\tensorflow\python\framework\errors_impl.py", line 519, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: Unsuccessful TensorSliceReader constructor: Failed to get matching files on C:/tensorflow1/models/research/object_detection/faster_rcnn_inception_v2_coco_2018_01_28/model.ckpt: Not found: FindFirstFile failed for: C:/tensorflow1/models/research/object_detection/faster_rcnn_inception_v2_coco_2018_01_28 : The system cannot find the path specified. ; No such process

Kkaranmore avatar Feb 09 '19 05:02 Kkaranmore

Looks like you probably did not follow all of the steps in 2a, "Download TensorFlow Object Detection API repository from GitHub" and/or 2b, "Download the Faster-RCNN-Inception-V2-COCO model from TensorFlow's model zoo". Try following those steps again exactly and that should fix your problem.
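
A quick way to check this before training, as a sketch: the helper checkpoint_files below is hypothetical and only uses the standard library. It assumes the usual layout where a checkpoint prefix such as model.ckpt corresponds to files like model.ckpt.index and model.ckpt.data-00000-of-00001 inside the extracted model folder.

```python
from pathlib import Path

def checkpoint_files(prefix):
    """Return names of files on disk that start with the checkpoint prefix."""
    prefix = Path(prefix)
    if not prefix.parent.is_dir():
        return []  # the model folder itself is missing
    return sorted(p.name for p in prefix.parent.glob(prefix.name + ".*"))

if __name__ == "__main__":
    # Path taken from the error message above.
    ckpt = ("C:/tensorflow1/models/research/object_detection/"
            "faster_rcnn_inception_v2_coco_2018_01_28/model.ckpt")
    found = checkpoint_files(ckpt)
    print(found if found else "No checkpoint files found for " + ckpt)
```

An empty result means either the folder was never extracted to that location or the path in the config does not match where it was extracted.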

jim-meyer avatar Feb 10 '19 14:02 jim-meyer

  File "C:\tensorflow1\models\research\object_detection\utils\learning_schedules.py", line 160, in manual_stepping
    raise ValueError('First step cannot be zero.')
ValueError: First step cannot be zero.

I edited the file and saved it, but when I train again it returns to its original value.

mohamedelsiesyibra avatar Mar 15 '19 09:03 mohamedelsiesyibra

I'm getting the error below when I try to run: python train.py --logtostderr --train_dir=training/ --pipeline_config_path=training/ssd_inception_v2_coco.config

WARNING: The TensorFlow contrib module will not be included in TensorFlow 2.0. For more information, please see:

  • https://github.com/tensorflow/community/blob/master/rfcs/20180907-contrib-sunset.md
  • https://github.com/tensorflow/addons

If you depend on functionality not listed there, please file an issue.

WARNING:tensorflow:From C:\Users\Asus\Miniconda3\lib\site-packages\tensorflow\python\platform\app.py:125: main (from __main__) is deprecated and will be removed in a future version.
Instructions for updating:
Use object_detection/model_main.py.
WARNING:tensorflow:From C:\Tensorflow\models\research\object_detection\legacy\trainer.py:266: create_global_step (from tensorflow.contrib.framework.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Please switch to tf.train.create_global_step
WARNING:tensorflow:From C:\Users\Asus\Miniconda3\lib\site-packages\tensorflow\python\framework\op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.

Traceback (most recent call last):
  File "train.py", line 184, in <module>
    tf.app.run()
  File "C:\Users\Asus\Miniconda3\lib\site-packages\tensorflow\python\platform\app.py", line 125, in run
    _sys.exit(main(argv))
  File "C:\Users\Asus\Miniconda3\lib\site-packages\tensorflow\python\util\deprecation.py", line 324, in new_func
    return func(*args, **kwargs)
  File "train.py", line 180, in main
    graph_hook_fn=graph_rewriter_fn)
  File "C:\Tensorflow\models\research\object_detection\legacy\trainer.py", line 280, in train
    train_config.prefetch_queue_capacity, data_augmentation_options)
  File "C:\Tensorflow\models\research\object_detection\legacy\trainer.py", line 59, in create_input_queue
    tensor_dict = create_tensor_dict_fn()
  File "train.py", line 121, in get_next
    dataset_builder.build(config)).get_next()
  File "C:\Tensorflow\models\research\object_detection\builders\dataset_builder.py", line 124, in build
    num_additional_channels=input_reader_config.num_additional_channels)
  File "C:\Tensorflow\models\research\object_detection\data_decoders\tf_example_decoder.py", line 307, in __init__
    default_value=''),
  File "C:\Tensorflow\models\research\object_detection\data_decoders\tf_example_decoder.py", line 59, in __init__
    label_map_proto_file, use_display_name=False)
  File "C:\Tensorflow\models\research\object_detection\utils\label_map_util.py", line 164, in get_label_map_dict
    label_map = load_labelmap(label_map_path)
  File "C:\Tensorflow\models\research\object_detection\utils\label_map_util.py", line 133, in load_labelmap
    label_map_string = fid.read()
  File "C:\Users\Asus\Miniconda3\lib\site-packages\tensorflow\python\lib\io\file_io.py", line 125, in read
    self._preread_check()
  File "C:\Users\Asus\Miniconda3\lib\site-packages\tensorflow\python\lib\io\file_io.py", line 85, in _preread_check
    compat.as_bytes(self.__name), 1024 * 512, status)
  File "C:\Users\Asus\Miniconda3\lib\site-packages\tensorflow\python\framework\errors_impl.py", line 528, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: NewRandomAccessFile failed to Create/Open: C:\Tensorflow\workspace raining_demonnotations/label_map.pbtxt : The filename, directory name, or volume label syntax is incorrect. ; Unknown error

bebop-boop avatar Jun 01 '19 07:06 bebop-boop

@ShubhranshuMaurya that error seems to indicate that there is something wrong with C:\Tensorflow\workspace raining_demonnotations/label_map.pbtxt. Have you opened that file in a text editor to see if it looks right? That file should look something like this:

item {
  name: 'Class1'
  id: 1
  display_name: 'Class1 Label Name'
}

item {
  name: 'Class2'
  id: 2
  display_name: 'Class2 Label Name'
}

IIRC this file could also be a binary protobuf file in which case viewing it in a text editor won't tell you much. But if it appears to be binary perhaps you could try creating a text version with your training labels and see if that works.
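
Another possibility worth checking, and this is a guess from the mangled path in the error rather than something confirmed in this thread: the path may have been written with single backslashes inside a quoted string, so that \t (in \training_demo) and \a (in \annotations) were read as escape characters. Python string literals treat those two escapes the same way, so the sketch below shows the effect. The variable names are illustrative only.

```python
# "\t" is a tab and "\a" is a bell character inside a normal string literal.
mangled = "workspace\training_demo\annotations/label_map.pbtxt"

# A raw string keeps the backslashes exactly as written.
intended = r"workspace\training_demo\annotations/label_map.pbtxt"

# Forward slashes avoid the escaping problem altogether.
safe = "workspace/training_demo/annotations/label_map.pbtxt"

print(repr(mangled))
print("\t" in mangled, "\a" in mangled)
print(intended.replace("\\", "/") == safe)
```

If that is what happened, writing the label map path in the config with forward slashes (or doubled backslashes) would be the thing to try.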

jim-meyer avatar Jun 01 '19 20:06 jim-meyer

#TensorFlow custom training

ERROR: raise ValueError('First step cannot be zero.') ValueError: First step cannot be zero.

SOLUTION: edit the .config file in object_detection\training\

train_config: {
  batch_size: 1
  optimizer {
    momentum_optimizer: {
      learning_rate: {
        manual_step_learning_rate {
          initial_learning_rate: 0.0002
          schedule {
            step: 900000
            learning_rate: .00002
          }
          schedule {
            step: 1200000
            learning_rate: .000002
          }
        }
      }
      momentum_optimizer_value: 0.9
    }
    use_moving_average: false
  }

bharath5673 avatar Jul 06 '19 14:07 bharath5673

For me it worked with 'step: 1'. For some reason there was 'step: 0'...
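
That matches the error message. A rough sketch for spotting it: the helper find_zero_steps below is hypothetical and just scans the config text for a `step: 0` line. The sample text is made up for the example, not copied from any shipped config.

```python
import re

def find_zero_steps(config_text):
    """Return 1-based line numbers whose content is exactly `step: 0`."""
    pattern = re.compile(r"^\s*step:\s*0\s*$")
    return [n for n, line in enumerate(config_text.splitlines(), start=1)
            if pattern.match(line)]

# Made-up sample with a zero first step, for illustration only.
sample = """\
manual_step_learning_rate {
  initial_learning_rate: 0.0002
  schedule {
    step: 0
    learning_rate: .0002
  }
  schedule {
    step: 900000
    learning_rate: .00002
  }
}
"""

print(find_zero_steps(sample))
```

To use it on a real file, read the config with open(...).read() and pass the text in.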

Arri avatar Aug 30 '19 04:08 Arri

TypeError: Cannot convert a list containing a tensor of dtype <dtype: 'int32'> to <dtype: 'float32'> (Tensor is: <tf.Tensor 'Preprocessor/stack_1:0' shape=(1, 3) dtype=int32>)

Did you find a solution?

siddas27 avatar Dec 24 '19 08:12 siddas27

yes, edit this in your config file in ...\models\research\object_detection\training

train_config: {
  batch_size: 1
  optimizer {
    momentum_optimizer: {
      learning_rate: {
        manual_step_learning_rate {
          initial_learning_rate: 0.0002
          schedule {
            step: 900000
            learning_rate: .00002
          }
          schedule {
            step: 1200000
            learning_rate: .000002
          }
        }
      }
      momentum_optimizer_value: 0.9
    }
    use_moving_average: false
  }

Can you explain what is happening with the learning rate? What do the two step values signify in the manual step learning rate, and what is the initial learning rate?

dpbnasika avatar Jun 23 '20 23:06 dpbnasika
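
As far as I understand it (a simplified sketch, not the library's implementation): initial_learning_rate is used from the start of training, and each schedule entry says "from this step onward, use this learning rate". The function manual_step_rate below is a hypothetical stand-in, and its zero-step check just mirrors the error message from this issue.

```python
def manual_step_rate(global_step, initial_rate, schedule):
    """Piecewise-constant rate; `schedule` is a list of (step, rate) pairs."""
    if schedule and schedule[0][0] == 0:
        raise ValueError('First step cannot be zero.')
    rate = initial_rate
    for step, new_rate in schedule:
        if global_step >= step:
            rate = new_rate  # later boundaries override earlier ones
    return rate

# Values from the config above.
schedule = [(900000, 0.00002), (1200000, 0.000002)]
for s in (0, 899999, 900000, 1200000):
    print(s, manual_step_rate(s, 0.0002, schedule))
```

So with this config the rate stays at 0.0002 until step 900000, drops to 0.00002, and drops again to 0.000002 at step 1200000.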

python train.py --logtostderr -train_dir=training/ --pipeline_config_path=training/ssd_mobilenet_v2_quantized_300x300_coco.config

Current thread 0x00005734 (most recent call first):
  File "C:\Users\EMRE\anaconda3\envs\gpuemre\lib\site-packages\tensorflow_core\python\lib\io\file_io.py", line 84 in _preread_check
  File "C:\Users\EMRE\anaconda3\envs\gpuemre\lib\site-packages\tensorflow_core\python\lib\io\file_io.py", line 122 in read
  File "C:\Users\EMRE\anaconda3\envs\gpuemre\lib\site-packages\object_detection-0.1-py3.7.egg\object_detection\utils\label_map_util.py", line 168 in load_labelmap
  File "C:\Users\EMRE\anaconda3\envs\gpuemre\lib\site-packages\object_detection-0.1-py3.7.egg\object_detection\utils\label_map_util.py", line 201 in get_label_map_dict
  File "C:\Users\EMRE\anaconda3\envs\gpuemre\lib\site-packages\object_detection-0.1-py3.7.egg\object_detection\data_decoders\tf_example_decoder.py", line 93 in __init__
  File "C:\Users\EMRE\anaconda3\envs\gpuemre\lib\site-packages\object_detection-0.1-py3.7.egg\object_detection\data_decoders\tf_example_decoder.py", line 460 in __init__
  File "C:\Users\EMRE\anaconda3\envs\gpuemre\lib\site-packages\object_detection-0.1-py3.7.egg\object_detection\builders\decoder_builder.py", line 63 in build
  File "C:\Users\EMRE\anaconda3\envs\gpuemre\lib\site-packages\object_detection-0.1-py3.7.egg\object_detection\builders\dataset_builder.py", line 209 in build
  File "train.py", line 123 in get_next
  File "C:\Users\EMRE\anaconda3\envs\gpuemre\lib\site-packages\object_detection-0.1-py3.7.egg\object_detection\legacy\trainer.py", line 58 in create_input_queue
  File "C:\Users\EMRE\anaconda3\envs\gpuemre\lib\site-packages\object_detection-0.1-py3.7.egg\object_detection\legacy\trainer.py", line 279 in train
  File "train.py", line 182 in main
  File "C:\Users\EMRE\anaconda3\envs\gpuemre\lib\site-packages\tensorflow_core\python\util\deprecation.py", line 324 in new_func
  File "C:\Users\EMRE\anaconda3\envs\gpuemre\lib\site-packages\absl\app.py", line 258 in _run_main
  File "C:\Users\EMRE\anaconda3\envs\gpuemre\lib\site-packages\absl\app.py", line 312 in run
  File "C:\Users\EMRE\anaconda3\envs\gpuemre\lib\site-packages\tensorflow_core\python\platform\app.py", line 40 in run
  File "train.py", line 186 in <module>

help

EMRYLMZ1 avatar Apr 11 '22 01:04 EMRYLMZ1