
Dataset plant_leaves loading fails with an HTTP 404 error

Open amine-lah opened this issue 2 years ago • 2 comments

Short description

When trying to load the plant_leaves dataset, downloading the files fails with an HTTP 404 error on one of the images (https://data.mendeley.com/v1/datasets/hb74ynkjcn/1/files/7f977a2c-aa75-4610-a6f6-84c8738d8c79/0001_0001.JPG).

Environment information

  • Operating System: Google Colab

  • Python version: 3.8

  • tensorflow-datasets/tfds-nightly version: tfds-nightly 4.8.2+nightly

  • tensorflow/tf-nightly version: tensorflow 2.9.2

  • Does the issue still exist with the latest tfds-nightly package (pip install --upgrade tfds-nightly)? Yes

Reproduction instructions

!pip install -q tfds-nightly tensorflow matplotlib
import tensorflow as tf
import tensorflow_datasets as tfds
# Construct a tf.data.Dataset
ds = tfds.load('plant_leaves', split='train', shuffle_files=True)

Link to logs

https://gist.github.com/men1n2/a92ed383191d23c9c11e466a3f0f0a2f

Expected behavior

The dataset should load without errors.

Additional context

amine-lah avatar Feb 03 '23 23:02 amine-lah

The image URLs for this dataset listed in https://github.com/tensorflow/datasets/blob/master/tensorflow_datasets/image_classification/plant_leaves_urls.txt seem to be outdated. I think fixing these URLs might solve this error.

OmkarBorhade98 avatar May 16 '23 17:05 OmkarBorhade98

PR #4758 was submitted for the same issue.

OmkarBorhade98 avatar May 25 '23 16:05 OmkarBorhade98
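
For anyone who wants to confirm which entries in plant_leaves_urls.txt are stale, here is a minimal sketch that sends a HEAD request to each URL and collects the ones that return an error status. It is not part of tfds: the helper names head_status and find_broken_urls are hypothetical, it assumes the file can be read as one URL per line, and it assumes the server answers HEAD requests the same way it answers GET requests.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict, Iterable


def head_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code of a HEAD request to `url`.

    Hypothetical helper, not a tfds function. Non-HTTP failures
    (DNS errors, timeouts) are not caught and will propagate.
    """
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urlopen raises on 4xx/5xx; the code is on the exception.
        return err.code


def find_broken_urls(
    urls: Iterable[str],
    get_status: Callable[[str], int] = head_status,
) -> Dict[str, int]:
    """Map each URL whose status is >= 400 to that status.

    Assumes one URL per item; blank items are skipped.
    """
    broken = {}
    for raw in urls:
        url = raw.strip()
        if not url:
            continue
        status = get_status(url)
        if status >= 400:
            broken[url] = status
    return broken
```

To run it against a local copy of the URL list, something like `find_broken_urls(open("plant_leaves_urls.txt"))` should work, assuming that file name and the one-URL-per-line layout. The status function is passed in as a parameter so the check can be tried on a handful of URLs, or with a stub, before hitting the server for every image.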