GloVe
Download links have been timing out for a few days
Hey,
I don't know if this is the correct place for this; if not, please feel free to close the issue. For the last couple of days I have been struggling to download the pre-trained word vectors from the stanford.edu webpage.
None of the files listed at https://nlp.stanford.edu/projects/glove/ can be downloaded, for example https://nlp.stanford.edu/data/glove.6B.zip https://nlp.stanford.edu/data/glove.840B.300d.zip ...
Are others facing the same issue? And yes, I am aware that there is a mirror on Hugging Face.
It looks like some browsers are blocking the download out of an excess of security caution. We're going to see if there's an easy way around these download limitations, but in the meantime you should be able to download by pasting the URL directly into your address bar:
https://nlp.stanford.edu/data/glove.6B.zip
On Chrome I have to do that twice before it actually does anything.
In reply to Michael Plainer's post of Tue, Jun 21, 2022 (https://github.com/stanfordnlp/GloVe/issues/206).
I'm having this issue as well. The URL provided above doesn't resolve the problem for me. Using wget also results in a timeout error.
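For anyone scripting the download, the "try the primary link, then fall back to a mirror" approach discussed in this thread could be sketched roughly as below. This is only an illustration using the Python standard library: `download_with_fallback` is a hypothetical helper, the 30-second timeout is an arbitrary choice, and the mirror URL is left as a placeholder for whichever link the readme gives you.

```python
import shutil
import urllib.request
from pathlib import Path


def download_with_fallback(urls, dest, timeout=30):
    """Try each URL in order and return the one that succeeded.

    Raises RuntimeError listing every failure if none of the URLs work.
    (Hypothetical helper for illustration, not part of GloVe.)
    """
    dest = Path(dest)
    tmp = dest.with_name(dest.name + ".part")  # partial file, renamed on success
    errors = []
    for url in urls:
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp, open(tmp, "wb") as out:
                shutil.copyfileobj(resp, out)
        except OSError as exc:
            # URLError, HTTPError and socket timeouts are all OSError subclasses.
            errors.append(f"{url}: {exc}")
            tmp.unlink(missing_ok=True)
            continue
        tmp.replace(dest)
        return url
    raise RuntimeError("all downloads failed:\n" + "\n".join(errors))


# Example call (not executed here). The first URL is the one from this thread;
# MIRROR_URL is a placeholder you would fill in from the readme.
# download_with_fallback(
#     ["https://nlp.stanford.edu/data/glove.6B.zip", MIRROR_URL],
#     "glove.6B.zip",
# )
```

Writing to a `.part` file first means a timed-out attempt never leaves a truncated zip under the final name.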
The situation is different now: this outage is because of a campus-wide power outage. It will hopefully be back soonish.
I'm having the same problem. So we just need to wait?
@AsmaAsmaGharbi The readme of this repo lists mirrors on Hugging Face; you can use those. If the error occurs inside a library such as PyTorch, there is probably a way to change the download URL or to move the file into the library's cache folder.
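As a rough sketch of the "move the file into the cache folder" option: the snippet below copies an archive you downloaded by hand into a cache directory so that a library can find it there. `place_in_cache` is a hypothetical helper, and both the cache location and the expected file name are assumptions that depend on the library you use, so check its documentation for the real values.

```python
import shutil
from pathlib import Path


def place_in_cache(downloaded_file, cache_dir):
    """Copy a manually downloaded archive into cache_dir, keeping its file name.

    Hypothetical helper: the cache directory a given library actually reads
    from is an assumption you need to confirm in that library's docs.
    """
    src = Path(downloaded_file)
    cache = Path(cache_dir)
    cache.mkdir(parents=True, exist_ok=True)
    target = cache / src.name
    # Skip the copy if a file of the same size is already in place.
    if not target.exists() or target.stat().st_size != src.stat().st_size:
        shutil.copy2(src, target)
    return target
```

Many libraries skip the download when the expected file already exists in their cache, but that behavior is library-specific and not guaranteed.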
I'm also having the same problem; the link is not working. The connection fails with wget, and the direct link doesn't work either: it gives a timeout error.
Resolved
@Sadiii The link text in the readme is a little confusing. The [mirror] link points to the Stanford download page, while the [glove.42B.300d.zip] link points to the Hugging Face mirrors described by @plainerman.
TL;DR, use these links and you should be able to download the dataset from huggingface:
- Common Crawl (42B tokens, 1.9M vocab, uncased, 300d vectors, 1.75 GB download)
- Common Crawl (840B tokens, 2.2M vocab, cased, 300d vectors, 2.03 GB download)
- Wikipedia 2014 + Gigaword 5 (6B tokens, 400K vocab, uncased, 300d vectors, 822 MB download)
- Twitter (2B tweets, 27B tokens, 1.2M vocab, uncased, 200d vectors, 1.42 GB download)
Another idea would be to put the models on AcademicTorrents (https://academictorrents.com/). I could help seed them.
I wouldn't say no, especially since the GloVe models don't change too often, but have you been noticing problems lately? I'm not sure that we want to add more work if the problem is already solved.
(In reply to Hendursaga's comment of Wed, Oct 5, 2022: https://github.com/stanfordnlp/GloVe/issues/206#issuecomment-1268687480)