Loading the plant_leaves dataset fails with an HTTP 404 error
Short description
When trying to load the plant_leaves dataset, the file download fails with an HTTP 404 error on one of the images (https://data.mendeley.com/v1/datasets/hb74ynkjcn/1/files/7f977a2c-aa75-4610-a6f6-84c8738d8c79/0001_0001.JPG).
Environment information
- Operating System: Google Colab
- Python version: 3.8
- tensorflow-datasets/tfds-nightly version: tfds-nightly 4.8.2+nightly
- tensorflow/tf-nightly version: tensorflow 2.9.2
- Does the issue still exist with the latest tfds-nightly package (pip install --upgrade tfds-nightly)? Yes
Reproduction instructions
!pip install -q tfds-nightly tensorflow matplotlib
import tensorflow as tf
import tensorflow_datasets as tfds
# Construct a tf.data.Dataset
ds = tfds.load('plant_leaves', split='train', shuffle_files=True)
Link to logs
https://gist.github.com/men1n2/a92ed383191d23c9c11e466a3f0f0a2f
Expected behavior
The dataset should load without errors.
Additional context
The image URLs for this dataset listed in https://github.com/tensorflow/datasets/blob/master/tensorflow_datasets/image_classification/plant_leaves_urls.txt seem to be outdated. I think updating these URLs might fix this error.
PR #4758 was submitted for the same issue.
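To check how many of the listed URLs are actually stale, something like the following could be used. This is only a rough sketch using the Python standard library: the helper names http_status and find_broken_urls are hypothetical (they are not part of tensorflow_datasets), it assumes the URLs have already been extracted from plant_leaves_urls.txt into a list, and it assumes the server answers HEAD requests in the same way as GET requests, which may not hold.

```python
import urllib.error
import urllib.request


def http_status(url, timeout=10):
    """Return the HTTP status code of a HEAD request, or None if no response.

    Hypothetical helper: assumes the server handles HEAD like GET.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses are raised as HTTPError; the code is what we want.
        return err.code
    except OSError:
        # Covers URLError (DNS failure, refused connection) and timeouts.
        return None


def find_broken_urls(urls, status_fn=http_status):
    """Return (url, status) pairs for every URL that does not answer 200.

    status_fn is injectable so the logic can be exercised without network access.
    """
    broken = []
    for url in urls:
        status = status_fn(url)
        if status != 200:
            broken.append((url, status))
    return broken
```

Calling find_broken_urls with a list containing the image URL from the short description should report that URL with status 404 if the link is still dead at the time of the check. A status of None means the request did not get an HTTP response at all, so it is not by itself proof that the URL is outdated.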