
Question about custom dataset preparation: tensorflow.python.framework.errors_impl.OutOfRangeError: 2 root error(s) found.

Open LigZhong opened this issue 3 years ago • 3 comments

Hello, thanks for your great work. I ran into a problem when training on my own dataset. I created the tfrecords from my own dataset (JPG files only) and ran the Python scripts as indicated, but when I run the training code I get this error:

python3 run_training.py --data-dir ./dataset --dataset custom_3 --num-gpus 1 --metrics=ids36k5 --total-kimg 5000
Local submit - run_dir: results/00022-co-mod-gan-custom_3-1gpu
dnnlib: Running training.training_loop.training_loop() on localhost...
Streaming data using training.dataset.TFRecordDataset...
tfrecord_dir: dataset/custom_3
max_shape: [3, 256, 256]
Dataset shape = [3, 256, 256]
Dynamic range = [0, 255]
Label size = 0

Building TensorFlow graph...
Initializing logs...
Training for 50000 kimg...

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1365, in _do_call
    return fn(*args)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1350, in _run_fn
    target_list, run_metadata)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/client/session.py", line 1443, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.OutOfRangeError: 2 root error(s) found.
  (0) Out of range: End of sequence
     [[{{node GPU0/DataFetch/IteratorGetNext}}]]
  (1) Out of range: End of sequence
     [[{{node GPU0/DataFetch/IteratorGetNext}}]]
     [[GPU0/DataFetch/IteratorGetNext/_2837]]
0 successful operations.
0 derived errors ignored.

Can anyone give a hint?

LigZhong avatar Jun 26 '21 09:06 LigZhong

I also got the 'Out of range: End of sequence' error. Adding a validation directory when preparing the dataset solved it for me. You can try it:

python dataset_tools/create_from_images.py --train-image-dir ./imgs/png_samples/ --val-image-dir ./imgs/png_samples/ --tfrecord-dir ./train_dataset --resolution 512 --num-channels 3
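
One way to check whether the dataset directory actually contains records before launching training is sketched below. This is not part of co-mod-gan: it assumes the "End of sequence" error comes from a TFRecord file that is empty or truncated (for example, a validation split written with no images), and that the files in the directory end in `.tfrecords`. The helper names `count_tfrecord_records` and `report_dataset` are hypothetical. The script walks the standard TFRecord framing directly, so it does not need TensorFlow, and it skips the CRC fields rather than verifying them.

```python
import glob
import os
import struct
import sys


def count_tfrecord_records(path):
    """Count records in one TFRecord file by walking its framing.

    Each record is laid out as: uint64 payload length (little-endian),
    uint32 CRC of the length, the payload, uint32 CRC of the payload.
    The CRCs are skipped here, not verified.
    """
    size = os.path.getsize(path)
    pos = 0
    count = 0
    with open(path, "rb") as f:
        while pos < size:
            f.seek(pos)
            header = f.read(8)
            if len(header) < 8:
                raise ValueError("truncated record header in %s" % path)
            (length,) = struct.unpack("<Q", header)
            # Advance past length, length CRC, payload, and payload CRC.
            pos += 8 + 4 + length + 4
            if pos > size:
                raise ValueError("truncated record payload in %s" % path)
            count += 1
    return count


def report_dataset(tfrecord_dir):
    """Return {file name: record count} for every *.tfrecords file found.

    The '*.tfrecords' pattern is an assumption about how the files are named.
    """
    counts = {}
    for path in sorted(glob.glob(os.path.join(tfrecord_dir, "*.tfrecords"))):
        counts[os.path.basename(path)] = count_tfrecord_records(path)
    return counts


if __name__ == "__main__":
    # Default path taken from the tfrecord_dir shown in the log above.
    directory = sys.argv[1] if len(sys.argv) > 1 else "./dataset/custom_3"
    counts = report_dataset(directory)
    if not counts:
        print("No .tfrecords files found in", directory)
    for name, n in counts.items():
        print("%s: %d record(s)%s" % (name, n, "  <-- empty" if n == 0 else ""))
```

If any file reports zero records, or no files are found at all, the data iterator has nothing to fetch, which would be consistent with the error above. Regenerating the tfrecords with both `--train-image-dir` and `--val-image-dir` set, as in the command above, is then worth trying.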

liupgd avatar Jun 28 '21 08:06 liupgd

I think you are trying to use more than 1 GPU when you only have one.

mostafa610 avatar Jun 28 '21 21:06 mostafa610

Hello, may I ask if you have solved this problem?

zzz105120 avatar Mar 26 '22 05:03 zzz105120