NonMatchingChecksumError while downloading 'multi_news' or 'cnn_dailymail' dataset

Open singhniraj08 opened this issue 1 year ago • 2 comments

Short description

I get a NonMatchingChecksumError while downloading the multi_news or cnn_dailymail datasets.

Environment information

  • Operating System: Colab

  • Python version: 3.10

  • tensorflow-datasets/tfds-nightly version: tensorflow-datasets 4.9.4

  • tensorflow/tf-nightly version: tensorflow 2.15

  • Does the issue still exist with the latest tfds-nightly package (pip install --upgrade tfds-nightly)? Yes

Reproduction instructions

(https://colab.sandbox.google.com/gist/singhniraj08/9f80bc167706b9b351b75e003dcad39c/untitled2.ipynb)

Link to logs

NonMatchingChecksumError: Artifact https://drive.google.com/uc?export=download&id=1vRY2wM6rlOZrf9exGTm5pXj5ExlVwJ0C, downloaded to /root/tensorflow_datasets/downloads/ucexport_download_id_1vRY2wM6rlOZrf9exGTm5pXj5OT0RBXCg5OWBrYMJXysF1hdrkZtPhK-7JWdYi2HrYYc.tmp.c134b8c8d86c4764bad073c9d79db385/download, has wrong checksum:

  • Expected: UrlInfo(size=245.06 MiB, checksum='64ae4d2483b248c9664b50bacfab6821f8a3e93f382c7587686fa4a127f77626', filename='multi-news-original-20190725T164630Z-001.zip')
  • Got: UrlInfo(size=2.40 KiB, checksum='d86ce49a2cafe0ed25eae0c9a5ed9abf8db1e34414e3acb667e316ad221c73c5', filename='download')

To debug, see: https://www.tensorflow.org/datasets/overview#fixing_nonmatchingchecksumerror
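
The "Got" entry is only 2.40 KiB against an expected 245.06 MiB, which suggests the response was something other than the archive (possibly an HTML page returned by Google Drive, though that is an assumption and not confirmed by the log). A minimal sketch for inspecting a downloaded file by hand, assuming the checksum in UrlInfo is a SHA-256 hex digest (its 64-character length suggests so); the helper names describe_download and matches_expected are hypothetical and not part of tfds:

```python
import hashlib
from pathlib import Path

# Expected checksum copied from the "Expected" UrlInfo in the error message.
EXPECTED_SHA256 = "64ae4d2483b248c9664b50bacfab6821f8a3e93f382c7587686fa4a127f77626"


def describe_download(path, chunk_size=1 << 20):
    """Return (size_in_bytes, sha256_hex, looks_like_zip) for a local file.

    Hypothetical helper: hashes the file in chunks so a large archive
    does not have to fit in memory.
    """
    path = Path(path)
    digest = hashlib.sha256()
    with path.open("rb") as f:
        head = f.read(4)
        digest.update(head)
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    # Zip archives begin with the bytes "PK"; an HTML error page does not.
    return path.stat().st_size, digest.hexdigest(), head.startswith(b"PK")


def matches_expected(path, expected_sha256=EXPECTED_SHA256):
    """True if the file's SHA-256 equals the expected checksum."""
    _, sha256, _ = describe_download(path)
    return sha256 == expected_sha256
```

Running describe_download on the file under /root/tensorflow_datasets/downloads/ (or on a copy fetched in a browser) shows whether the bytes on disk are a zip archive at all, before comparing checksums.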

Expected behavior

The dataset should download without any issues.

singhniraj08 • Jan 16 '24 06:01

Hello @singhniraj08, this is a persistent problem in tfds (#3935) and there is no fix so far, although you can work around it by downloading the dataset manually.
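
A minimal sketch of the manual route, assuming the default manual directory described in the TFDS overview (~/tensorflow_datasets/downloads/manual/) and assuming the archive has already been fetched in a browser. The function stage_manual_download is a hypothetical helper, not a tfds API, and whether tfds picks the file up for multi_news has not been verified here:

```python
import shutil
from pathlib import Path

# Assumption: default manual directory from the TFDS overview page.
MANUAL_DIR = Path.home() / "tensorflow_datasets" / "downloads" / "manual"
# Filename taken from the "Expected" UrlInfo in the error message.
EXPECTED_FILENAME = "multi-news-original-20190725T164630Z-001.zip"


def stage_manual_download(downloaded_file, manual_dir=MANUAL_DIR,
                          filename=EXPECTED_FILENAME):
    """Copy a manually downloaded archive into manual_dir under the expected name.

    Hypothetical helper: creates manual_dir if needed and returns the
    path of the staged copy.
    """
    manual_dir = Path(manual_dir)
    manual_dir.mkdir(parents=True, exist_ok=True)
    target = manual_dir / filename
    shutil.copyfile(downloaded_file, target)
    return target
```

After staging the file, rerunning the original load call should let the download step find the archive locally instead of fetching it from Google Drive, if the dataset honors manual_dir as the overview describes.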

Thank you,

83here • Jan 21 '24 13:01

@singhniraj08, you can visit this link: https://www.tensorflow.org/datasets/overview#fixing_nonmatchingchecksumerror. As far as I know, this issue has not been solved yet.

Rahulraj0308 • Jan 28 '24 15:01