datasets
Loading SUN397 crashed
Hi there, I'm trying to load the SUN397 dataset to build a small image-classification NN, and I get an error while the dataset is loading. When TensorFlow reaches roughly example 55910-55920, it raises: ValueError: Cannot take the length of shape with unknown rank.
Windows 10 Pro
Python Version: 3.8.6
tfds Version: 4.1.0
tfds-nightly Version:
tensorflow Version: 2.4.0
Here is the code I use to load the SUN397 dataset:
batch_size = 128
(train_ds, val_ds), info = tfds.load("sun397", split=["train[:55900]", "validation"], as_supervised=True, with_info=True)
# Model
train_ds = train_ds.map(lambda img, label: (tf.image.resize(img, [img_width, img_height]) / 255.0, label)).shuffle(1024).batch(batch_size)
val_ds = val_ds.map(lambda img, label: (tf.image.resize(img, [img_width, img_height]) / 255.0, label)).batch(batch_size)
Here are my Logs:
Generating train examples...: 55816 examples [16:46, 113.83 examples/s]
Generating train examples...: 55830 examples [16:46, 117.48 examples/s]
Generating train examples...: 55843 examples [16:46, 117.46 examples/s]
Generating train examples...: 55856 examples [16:46, 81.19 examples/s]
Generating train examples...: 55866 examples [16:46, 53.75 examples/s]
Generating train examples...: 55874 examples [16:47, 52.12 examples/s]
Generating train examples...: 55881 examples [16:47, 42.26 examples/s]
Generating train examples...: 55887 examples [16:47, 46.28 examples/s]
Generating train examples...: 55893 examples [16:47, 44.83 examples/s]
Generating train examples...: 55907 examples [16:47, 56.20 examples/s]
Generating train examples...: 55915 examples [16:48, 39.78 examples/s]
Generating train examples...: 55922 examples [16:48, 44.33 examples/s]
WARNING:absl:Image /t/track/outdoor/sun_aophkoiosslinihb.jpg could not be decoded by OpenCV, falling back to TF
CRITICAL:absl:Image /t/track/outdoor/sun_aophkoiosslinihb.jpg could not be decoded by Tensorflow
Traceback (most recent call last):
File "C:/Users/Jann/OneDrive - UMB AG/Schule/IT/Semester 3/122/Projekt/Code/Test/TestModel.py", line 31, in <module>
(train_ds, val_ds), info = tfds.load("sun397", split=["train[:55900]", "validation"], as_supervised=True, with_info=True)
File "C:\Users\Jann\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow_datasets\core\load.py", line 328, in load
dbuilder.download_and_prepare(**download_and_prepare_kwargs)
File "C:\Users\Jann\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow_datasets\core\dataset_builder.py", line 432, in download_and_prepare
self._download_and_prepare(
File "C:\Users\Jann\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow_datasets\core\dataset_builder.py", line 1158, in _download_and_prepare
split_info_futures = [
File "C:\Users\Jann\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow_datasets\core\dataset_builder.py", line 1159, in <listcomp>
split_builder.submit_split_generation( # pylint: disable=g-complex-comprehension
File "C:\Users\Jann\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow_datasets\core\split_builder.py", line 295, in submit_split_generation
return self._build_from_generator(**build_kwargs)
File "C:\Users\Jann\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow_datasets\core\split_builder.py", line 354, in _build_from_generator
for key, example in utils.tqdm(
File "C:\Users\Jann\AppData\Local\Programs\Python\Python38\lib\site-packages\tqdm\std.py", line 1167, in __iter__
for obj in iterable:
File "C:\Users\Jann\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow_datasets\image_classification\sun.py", line 283, in _generate_examples
image = _process_image_file(
File "C:\Users\Jann\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow_datasets\image_classification\sun.py", line 123, in _process_image_file
image = _decode_image(fobj, session, filename=filename)
File "C:\Users\Jann\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow_datasets\image_classification\sun.py", line 104, in _decode_image
if len(image.shape) == 4: # rank=4 -> rank=3
File "C:\Users\Jann\AppData\Local\Programs\Python\Python38\lib\site-packages\tensorflow\python\framework\tensor_shape.py", line 848, in __len__
raise ValueError("Cannot take the length of shape with unknown rank.")
ValueError: Cannot take the length of shape with unknown rank.
libpng warning: iCCP: known incorrect sRGB profile
libpng warning: iCCP: known incorrect sRGB profile
libpng warning: iCCP: known incorrect sRGB profile
libpng warning: iCCP: known incorrect sRGB profile
libpng warning: iCCP: known incorrect sRGB profile
libpng warning: iCCP: known incorrect sRGB profile
libpng warning: iCCP: known incorrect sRGB profile
libpng warning: iCCP: known incorrect sRGB profile
libpng warning: iCCP: known incorrect sRGB profile
libpng warning: iCCP: known incorrect sRGB profile
libpng warning: iCCP: known incorrect sRGB profile
libpng warning: iCCP: known incorrect sRGB profile
libpng warning: Ignoring invalid time value
Process finished with exit code 1
Thanks to anyone who has an idea what's happening. I tried a lot and read through the documentation but didn't find anything, so thanks a lot.
Can you paste your complete code and the error message?
Thank you for sharing the logs:
CRITICAL:absl:Image /t/track/outdoor/sun_aophkoiosslinihb.jpg could not be decoded by Tensorflow
It seems to be raised by:
https://github.com/tensorflow/datasets/blob/721b0d8ff937dd6cf97604e1447acc84393290a9/tensorflow_datasets/image_classification/sun.py#L101
I'm not sure why this is raised only now and not before. Does this mean that new TF versions are unable to decode images that were previously decoded correctly with tf.image.decode_image? Is it system dependent (only on Windows)?
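For readers following the traceback, here is a minimal pure-Python sketch of the failure mode the log suggests: the first decoder yields nothing, the fallback fails too, and the code then inspects a value that was never decoded. The two decoders below are hypothetical stand-ins (no OpenCV or TensorFlow involved), and `decode_with_fallback` is an illustrative name, not a function in sun.py. The only point is that failing fast with the file name gives a clearer error than a later shape check.

```python
from typing import Callable, Optional, Sequence


class ImageDecodeError(ValueError):
    """Raised when every decoder in the chain rejects the image bytes."""


def decode_with_fallback(
    buf: bytes,
    decoders: Sequence[Callable[[bytes], Optional[list]]],
    filename: str,
) -> list:
    """Try each decoder in order and return the first successful result.

    A decoder signals failure by returning None or raising ValueError.
    """
    for decode in decoders:
        try:
            image = decode(buf)
        except ValueError:
            image = None
        if image is not None:
            return image
    # Fail here, with the file name, instead of passing an undecoded
    # value on to later code that inspects its shape.
    raise ImageDecodeError(f"{filename} could not be decoded by any decoder")


# Hypothetical stand-in decoders, for illustration only.
def primary_decoder(buf: bytes) -> Optional[list]:
    return None  # mimics a decoder that returns None on failure


def fallback_decoder(buf: bytes) -> Optional[list]:
    if not buf.startswith(b"\xff\xd8"):  # JPEG files start with these bytes
        raise ValueError("not a JPEG")
    return [[0, 0, 0]]  # placeholder "pixels"
```

With these stand-ins, bytes that start with the JPEG marker are accepted by the fallback, while anything else raises `ImageDecodeError` naming the offending file, which a caller could catch in order to skip that single example.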
@rohit11544 I did post my code above. But here is the line I used; I ran the code with only this line in it:
(train_ds, val_ds), info = tfds.load("sun397", split=["train[:55900]", "validation"], as_supervised=True, with_info=True)
The error you asked about is above as well.
@FPGSchiba The issue is with the split argument: it accepts only "train" and "test", not "validation". Replace "validation" with "test" and it will work. I ran the same code on the mnist dataset; see these screenshots.
1) split=["train[:55900]", "test"]
2) split=["train[:55900]", "validation"]
@rohit11544 I don't think this works for SUN397. In your example you are loading mnist, not SUN397. And my error has nothing to do with splitting. Did you even read my first post?
@FPGSchiba First, I am sorry for running your code on the mnist dataset rather than SUN397. I completely agree with your point, but the split argument doesn't depend on the dataset, right? So I suggested replacing "validation" with "test". If it works, fine; if not, we can try to find another solution.
@rohit11544 Sorry for my harsh words, but I am sure that the splits are specific to each dataset. I tried it anyway and it didn't work. The log output was the same as in the first post.
That's the code I used to run:
(train_ds, val_ds), info = tfds.load("sun397", split=["train", "test"], as_supervised=True, with_info=True)
Thanks for your help
@FPGSchiba Hey, that's completely fine. I didn't know that the splits are specific to each dataset, so I suggested it that way. OK, we will find another way.
@rohit11544 @FPGSchiba Is there any update on this, or have you found a solution? I am getting the same error on Ubuntu with tfds 4.2.0 and tensorflow 2.4.1.
This is the code I use to load the dataset:
dataset_builder = tfds.builder("sun397/tfds:4.*.*", data_dir=data_dir)
dataset_builder.download_and_prepare()
I will try this one thanks @simonre.
Sorry for closing so fast. I got the same error. If it works on Ubuntu, perhaps it is system dependent?
It does not work for me on ubuntu. I get the same error as you.
I have the same problem on ubuntu 18.04:
CRITICAL:absl:Image /t/track/outdoor/sun_aophkoiosslinihb.jpg could not be decoded by Tensorflow
Traceback (most recent call last):
File "run_meta_learn.py", line 407, in
I am getting this error:
File "/opt/conda/lib/python3.7/site-packages/tensorflow/python/framework/ops.py", line 2045, in __init__
self._traceback = tf_stack.extract_stack_for_node(self._c_op)
Image /t/track/outdoor/sun_aophkoiosslinihb.jpg could not be decoded by Tensorflow.
Any idea if it will be fixed?
I haven't tried for a while. Let's wait and see whether a fix is made. I think I will try again in the future.
If you modify the sun.py file from tensorflow_datasets in your environment to skip that file, you can download the dataset. This removes 1 image out of 72,000, which probably isn't a big deal in most situations.
Thanks for the idea. But could that not be fixed with a PR?
Just add this file to the ignore-images list in sun.py:
_SUN397_IGNORE_IMAGES = [ "SUN397/c/church/outdoor/sun_bhenjvsvrtumjuri.jpg", "SUN397/t/track/outdoor/sun_aophkoiosslinihb.jpg" ]
Merged PR https://github.com/tensorflow/datasets/pull/4955, this issue should be resolved now.