wikitext-103-raw-v1.zip is no longer available on amazonaws
The raw dataset wikitext-103-raw-v1.zip can no longer be downloaded from amazonaws, and I can't find it anywhere else on the internet. I see other people on different repos reporting that this raw dataset has disappeared, and I don't know whether this is permanent and/or whether a new dataset should be used in this tutorial example:
https://github.com/huggingface/tokenizers/blob/f0c48bd89a442819b39605ca117ecabd293bfdd7/docs/source-doc-builder/quicktour.mdx?plain=1#L15
I receive the following error when trying to wget the file:
wget --trust-server-names https://s3.amazonaws.com/research.metamind.io/wikitext/wikitext-103-raw-v1.zip
--2024-11-18 15:28:28-- https://s3.amazonaws.com/research.metamind.io/wikitext/wikitext-103-raw-v1.zip
Resolving s3.amazonaws.com (s3.amazonaws.com)... 54.231.172.216, 16.182.32.200, 52.217.112.120, ...
Connecting to s3.amazonaws.com (s3.amazonaws.com)|54.231.172.216|:443... connected.
HTTP request sent, awaiting response... 403 Forbidden
2024-11-18 15:28:28 ERROR 403: Forbidden.
The file is available at 'https://dax-cdn.cdn.appdomain.cloud/dax-wikitext-103/1.0.1/wikitext-103.tar.gz'
Do you want to open a PR to update the tutorial?
That link ('https://dax-cdn.cdn.appdomain.cloud/dax-wikitext-103/1.0.1/wikitext-103.tar.gz') does not work for me. I get: "We are having trouble finding that site"
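Since the links in this thread keep going stale, here is a rough sketch for checking which of the candidate URLs still responds before updating the tutorial. It uses only the Python standard library. The helper names `http_status` and `first_reachable` are made up for illustration and are not part of tokenizers; the two URLs are the ones quoted above. It assumes the server answers HEAD requests, which not every host does.

```python
import urllib.error
import urllib.request

# The two download locations mentioned in this thread.
CANDIDATES = [
    "https://s3.amazonaws.com/research.metamind.io/wikitext/wikitext-103-raw-v1.zip",
    "https://dax-cdn.cdn.appdomain.cloud/dax-wikitext-103/1.0.1/wikitext-103.tar.gz",
]


def http_status(url, timeout=10):
    """Hypothetical helper: HTTP status of a HEAD request, or None on network failure."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered with an error status, e.g. 403 Forbidden.
        return err.code
    except OSError:
        # DNS failure, refused connection, timeout (URLError is an OSError).
        return None


def first_reachable(urls, probe=http_status):
    """Hypothetical helper: first URL whose probe reports status 200, else None."""
    for url in urls:
        if probe(url) == 200:
            return url
    return None
```

Calling `first_reachable(CANDIDATES)` returns the first link that currently answers with 200, or `None` if neither does. The `probe` argument is only there so the logic can be exercised without network access.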
I found it here.