gpt-4chan-public

Model has been removed

Open jLynx opened this issue 2 years ago • 9 comments

The model that you linked to here https://huggingface.co/ykilcher/gpt-4chan has been removed. Any chance you could reupload somewhere else?

jLynx avatar Jun 10 '22 08:06 jLynx

I'm aware. I'm still trying to resolve this with HF, so maybe give it a bit of time :)

yk avatar Jun 10 '22 15:06 yk

I'm interested in doing some research with similar models in the near future. Have you considered making the model available via torrent? I'd happily seed the model indefinitely ;)

Captain-Wet-Beard avatar Jun 11 '22 14:06 Captain-Wet-Beard

Torrents. The only way to permanently put something on the internet. You are supposed to be a programmer. 🤣🤣🤣

You can search through DHT too! Here you go, a torrent. https://btdig.com/bf7434b0050c76127cff1296ab4e6b52ea2c4ac0/pytorch_model.bin

https://archive.org/details/gpt4chan_model

ValZapod avatar Jun 13 '22 06:06 ValZapod

Idk why I didn't think to check the archive, thanks for the link. Like I said, I'll seed it long term for those who are interested in the model.

Captain-Wet-Beard avatar Jun 13 '22 15:06 Captain-Wet-Beard

Note: I cloned the repository (not this one, the model repo) on my GitHub account, and I uploaded the model itself on the Internet Archive (32bit version).

Someone on Reddit reached out to me and asked for the 16bit version. I didn't have it, but I found it on 4chan, so I told the Reddit user. They also uploaded it to the Internet Archive and started seeding it.

  • Model repo: https://github.com/Aspie96/gpt-4chan-model (there are 2 branches).
  • Model 32 bit: https://archive.org/details/gpt4chan_model
  • Model 16 bit: https://archive.org/details/gpt4chan_model_float16

Admittedly, I am not seeding: I don't use the model personally and I don't have the resources to. However, I invite anyone who is willing and able to seed to do so (for this, I thank @Captain-Wet-Beard and everyone else who is seeding the model).

It is extremely important, for all information, that there isn't a potential censor, or a single point of failure. While trying to convince HuggingFace to allow the model is good, we should not depend on them for our usage of the model.

@yk, I noticed that in https://huggingface.co/ykilcher/gpt-4chan/discussions/4 you asked if you can advertise an alternative download source on HF. Rather than doing it on HF (it's not unreasonable for them to forbid this), I suggest that you make a whole video advertising the new download source. This is something HF cannot prevent in any way, and it will probably get even more attention than a link there. Also, if you plan to use BitTorrent to distribute this, I invite you to use the same files, which people are already seeding.

I think an open-source-licensed notebook which clones the repo through Git, downloads the model through a torrent and sets the model running would be helpful.
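A minimal sketch of the first two steps of such a notebook, in Python. Several things here are assumptions and not something confirmed in this thread: that `git` and `aria2c` are installed, that the Internet Archive item exposes a torrent at the usual `<identifier>_archive.torrent` path, and the local directory names. Loading and running the model is left out on purpose, since that depends on the files in the model repo.

```python
import subprocess

REPO_URL = "https://github.com/Aspie96/gpt-4chan-model"

# ASSUMPTION: Internet Archive items usually publish a torrent named
# "<identifier>_archive.torrent". Check the item page before relying on it.
ITEM_ID = "gpt4chan_model_float16"
TORRENT_URL = f"https://archive.org/download/{ITEM_ID}/{ITEM_ID}_archive.torrent"


def build_commands(repo_dir="gpt-4chan-model", model_dir="model"):
    """Return the shell commands for cloning the repo and fetching the model.

    The directory names are hypothetical defaults chosen for this sketch.
    """
    return [
        # Step 1: clone the model repository through Git.
        ["git", "clone", REPO_URL, repo_dir],
        # Step 2: let aria2c fetch the .torrent file and then its contents.
        ["aria2c", f"--dir={model_dir}", TORRENT_URL],
    ]


def run_all(commands):
    """Run each command in order and stop at the first failure."""
    for command in commands:
        subprocess.run(command, check=True)
```

In a notebook cell, `run_all(build_commands())` would run both steps; `build_commands()` alone only returns the command lists, so they can be inspected before anything is downloaded.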

Aspie96 avatar Jun 16 '22 20:06 Aspie96

Note: for those who do not trust me, or random users on Reddit and 4chan (as you shouldn't), Yannic himself published the MD5 hashes of the model here: https://huggingface.co/ykilcher/gpt-4chan and in this comment: https://huggingface.co/ykilcher/gpt-4chan/discussions/4#62a8f9b39e44ab41605b70a3

This is quite smart, because the repo only contains SHA256 hashes, but the Internet Archive uses MD5 in the files.xml file.

So, even without downloading anything, you can go check the hashes on the Internet Archive. I can confirm they are correct for both models.
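If you do download the files, you can repeat the same check locally. This is only a sketch: the helper names are made up for illustration, and the `files.xml` layout (`<file name="...">` entries with an `<md5>` child) is an assumption about how the Internet Archive formats that file.

```python
import hashlib
import xml.etree.ElementTree as ET


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def md5s_from_files_xml(xml_text):
    """Map file name -> MD5 from the text of a files.xml document.

    ASSUMPTION: entries look like <file name="x"><md5>...</md5></file>.
    Entries without an <md5> child are skipped.
    """
    root = ET.fromstring(xml_text)
    hashes = {}
    for entry in root.iter("file"):
        md5 = entry.findtext("md5")
        if md5:
            hashes[entry.get("name")] = md5.strip().lower()
    return hashes


def matches_published(path, published_md5):
    """True if the local file's MD5 equals the published hash."""
    return md5_of_file(path) == published_md5.strip().lower()
```

For example, `matches_published("pytorch_model.bin", "<hash from Yannic's comment>")` should return `True` for an intact download. Reading in chunks matters here because the model file is far too large to load into memory at once.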

Aspie96 avatar Jun 16 '22 20:06 Aspie96

> Torrents. The only way to permanently put something on the internet.

Using a download manager over HTTP is a little more complicated than using P2P. By complexity I mean that some servers do not let you request data starting from an arbitrary position. This can happen for a variety of reasons, but the most basic ones are server configuration (e.g. the operator's own policy for file downloads) and outdated server software.
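To make the "arbitrary position" point concrete, here is a rough Python sketch of resuming a download with an HTTP `Range` header. The function names and the `.part` file idea are hypothetical. A server that supports this answers `206 Partial Content`; one that does not typically answers `200` and sends the whole file again, in which case the only option is to start over.

```python
import os
import urllib.request


def range_header(part_path):
    """Build a Range header asking only for the bytes we do not have yet."""
    have = os.path.getsize(part_path) if os.path.exists(part_path) else 0
    return {"Range": f"bytes={have}-"} if have else {}


def can_append(status, sent_range):
    """True only if we asked for a range and the server honoured it (206)."""
    return bool(sent_range) and status == 206


def resume_download(url, part_path, chunk_size=1 << 20):
    """Continue a partial download if possible, otherwise restart it.

    Note: if the partial file is already complete, the server may reply
    with 416 and urllib raises HTTPError; this sketch does not handle that.
    """
    headers = range_header(part_path)
    request = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(request) as response:
        # Append on 206; overwrite if the server ignored the Range header.
        mode = "ab" if can_append(response.status, headers) else "wb"
        with open(part_path, mode) as out:
            while True:
                chunk = response.read(chunk_size)
                if not chunk:
                    break
                out.write(chunk)
```

With BitTorrent none of this bookkeeping is needed, because the protocol already splits the file into independently verified pieces.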

I think @Captain-Wet-Beard asked for a torrent primarily because of convenience.

ghost avatar Jul 02 '22 16:07 ghost

> By complexity I mean that some servers do not allow you to request data starting from an arbitrary position.

The Internet Archive is a real beauty in this respect: it does not show the total size of a file and has no way to recover from broken packets, so if a download breaks, it is over. Google Drive does this too, sometimes, and Mega, which asks to download into the browser cache first, is horrible! Chrome only recently added parallel downloading (fetching one file over multiple connections), and it still cannot resume a download from anywhere but the start; only pause is supported!! Epic and even Steam (?) moved to BitTorrent!

ValZapod avatar Sep 16 '22 06:09 ValZapod

Hi there, where is the config.json for the backup stored on the Internet Archive?

cynthiaio avatar Oct 08 '23 19:10 cynthiaio

@cynthiaio see: https://github.com/Aspie96/gpt-4chan-model

Aspie96 avatar Oct 08 '23 22:10 Aspie96

thank you for the torrents! <3

Rushmore75 avatar Oct 10 '23 21:10 Rushmore75