gpt-2 Release The Full Model!

I understand your concerns, but I still think it's better to release the full model now and let people poke at its abilities and discover potential issues more quickly.

Feb 15 '19 02:02 superjayman

Thanks for raising the issue! People have expressed similar sentiment internally and we take that argument seriously. Would love to see people start investigations with the small model and we will be re-evaluating our release of the larger models in the future.

Feb 15 '19 02:02 WuTheFWasThat

Actually, it seems more correct to leave this issue open :)

Feb 15 '19 02:02 WuTheFWasThat

plz release the models that support more languages

Feb 15 '19 03:02 yzho0907

Better safe than sorry. If the experts want caution, the least we can do is respect their judgement.

Feb 15 '19 03:02 gabefair

https://blog.openai.com/better-language-models/:

We will further publicly discuss this [model release] strategy in six months. If you’d like to discuss large language models and their implications, please email us at: [email protected].

Feb 15 '19 05:02 Franck-Dernoncourt

Will you be releasing the English speaking unicorns to the public?

Feb 15 '19 05:02 roschler

We don't know enough about unicorns to say they aren't dangerous. We will release a unicorn fetus for the scientific community to study for now, and re-evaluate later.

Feb 15 '19 05:02 WuTheFWasThat

It's a pity. Let me remind you of the name 'OpenAI'. Well, not so open, is it?

Feb 15 '19 06:02 superjayman

@superjayman I agree. Openness and sharing are the core of open innovation, and releasing everything does no harm but improves it much more quickly.

Feb 15 '19 07:02 yzho0907

Thanks for exercising caution and pointing out that you did. Seems cool.

Curious about focus. Can haz enlightenburger?

Feb 15 '19 08:02 bnealey

Can you train this on a list of translated sentences (from english to japanese for example) and use it as an AI language translator?

Feb 15 '19 08:02 marca-development
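
On the translation question above: the GPT-2 paper describes prompting the model with example pairs written as `english sentence = french sentence`, followed by a final `english sentence =` for the model to complete. Below is a minimal sketch of building that kind of text from a list of pairs. The function name and the example pairs are hypothetical, and nothing in this thread establishes how well this would work for English to Japanese.

```python
from typing import List, Tuple


def build_translation_prompt(pairs: List[Tuple[str, str]], query: str) -> str:
    """Format example pairs as 'source = target' lines, ending with an open 'query ='."""
    lines = [f"{source} = {target}" for source, target in pairs]
    # The last line is left open so a language model can continue it.
    lines.append(f"{query} =")
    return "\n".join(lines)


# Hypothetical example pairs, for illustration only.
examples = [
    ("Good morning.", "おはようございます。"),
    ("Thank you.", "ありがとうございます。"),
]
prompt = build_translation_prompt(examples, "Where is the station?")
print(prompt)
```

The same formatting could be applied to a larger list of sentence pairs to produce training text, though that is an assumption rather than something the released code documents.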

Isn't this exactly what OpenAI was not supposed to be about? Being closed source and up to the whim of PR teams and private incentives of a small number of people? I've made my own natural language generator that only says things that make sense (and asks itself questions based on its own answers) and it does the same thing. I also got it to believe in a god and break down from anxiety over "should" questions in a really insightful way that would be helpful to a lot of people. Turns out if you don't internally ask or answer "should" questions at all it's really hard to get into a social anxiety loop, and you can see it all break down like "what if they think [this] > what should I think about the human thinking [this] about me? > idk > I haven't talked to the human because I was thinking about this > what if they think [this] now is this good or bad? Should I care?" Wouldn't have known that if I stopped working on it like you guys did. It's like you're trying to answer the trolley problem like it's some kind of moral dilemma. Almost none of them are. It's an engineering problem. The velocity and mass of the trolley are not unknown, and there are 9 different ways you can stop the train using physics, but if you stand there thinking about whether you should pull it or not, you're fucked either way.

Even for fake news, this would be a good tool. If you're going to believe something just because it's possible to say it using the English language, you're a fucking idiot. Check the source. Check peer reviews. If anything, a random blog post generating fake news like this will point out how stupid people are for believing it. And it's much easier to do that than it is with mainstream media.

Feb 15 '19 09:02 Tophness

Maybe, as in POI, they are still teaching their child to be kind?

Feb 15 '19 10:02 fallenartist

I respect the decision: "with great power comes great responsibility". But I suggest releasing the 345M model. The reasons are twofold: it is much better than the 117M model but not nearly as good as the 1.5B model, and it has a similar number of parameters to the BERT-large-uncased model, which makes it a good candidate for comparison.

Feb 15 '19 10:02 chenyangh

Maybe the reason this repo exists is the same reason the team should release everything and be 'open', but they are the ones who make the decision anyway. I just hope it will be good for both the team and us.

Feb 15 '19 11:02 yzho0907

A lot of things could be misused in the wrong hands. But just imagine the positive feedback on your technology. What you fear is inevitable anyway.

Feb 15 '19 11:02 max-frai

https://news.ycombinator.com/item?id=19168712 please read the horrendous comments. Well, don't read all of them, it gets depressing. But jesus christ. You are OPEN ai. Do we really have to spell that out for you? O P E N

Feb 15 '19 12:02 dackdel

Help I need this to help write my 9th grade essays

Feb 15 '19 17:02 sciencemanx

Okay. A nonprofit writes an interesting product, which a for-profit could probably recreate and patent. Or am I wrong there?

Why release a teaser only? A shrunk, non-trainable thing, just there to show off?

I admit it. I am suitably impressed, but also seriously annoyed

"open" in name only.

Feb 15 '19 19:02 jensstark

It's worth remembering that OpenAI has in fact been pretty good about releasing the code to their stuff. They've been much more open than DeepMind, which I think was the concern that led to their creation. This seems comparable to responsible disclosure in software security: when an open source group finds a bug in widely deployed, un-updateable software, e.g. something used in routers, that could be used for large-scale spamming, they'll start work on ways to mitigate it before announcing what the vulnerability is. If someone who works for a FOSS company were to find a really efficient design for building software to do denial-of-service attacks, it'd be a similar story: look for DOS mitigations first.

I'd say it's a comparable situation to the latter: OpenAI is worried that they've built a generally useful tool that could make a category of DOS attack much much worse, and they don't currently see anything preventing that from happening.

I've been thinking that it might be good to get something like GPT2 1.5B into the hands of Google and Facebook and a few other major forum operators, maybe Reddit, under a contract to use it for improving moderation. (Edit to clarify: just giving them early access so they can use it to build safeguards against things like it.) It seems like GPT2 is good enough to take a serious crack at implementing xkcd's suggestion from nearly ten years ago: who cares if it's a human or a machine? The real question is whether it's malicious content. That proposal as it is wouldn't help so much with fake news, because people lying is a different problem than people doing a denial-of-service via vitriol, but it would make a big impact on a major source of the problem. Or perhaps the AI teams with enough resources could get together and talk about how to use this level of NLP performance to build other types of linguistic DOS mitigations.

I am, for my own curiosity, quite irritated that it's not being released, but I agree that the performance is reasonably worthy of the concern. I just don't see not releasing it as being that useful unless the time until someone replicates it is spent making mitigations to the world that will be created when someone else has a copy.

@WuTheFWasThat I do think y'all could probably release the training code safely, though. Seems to me that it's the dataset and the trained model, with something like $40k worth of compute behind it, that are the really interesting things here.

Feb 16 '19 03:02 lahwran

I think you guys are scared of nothing, release the whole model please.

It's not like you have 20,000 people who pulled this repo - so it's really hard to use this 'maliciously'

Besides, there are other alternatives that have produced similar (better) results to this one. Cakechat, for example: feed it the Reddit corpus (the same one that spooked you) and you'll get some crazy things. But just like when you tell a young kid 'it's just a movie' or 'just a game', this is just a computer program. It's not some sci-fi novel come to life.

Feb 16 '19 05:02 4R7I5T

We want the red pill!

Feb 16 '19 13:02 schwittlick

The resources needed to train the full model are beyond the average person and small companies, which could use this for potentially very interesting non-malicious applications. However, large organizations and state actors that are most likely to use this for malicious purposes can and typically do already have easy access to the resources needed to replicate the full model.

Therefore, by not releasing the full model you are ensuring that this sort of AI tech remains in the hands of powerful organizations and state actors that are most likely to misuse it, while at the same time unintentionally tricking the general public into thinking this tech is not "really" available yet. Releasing the full model and leveling the playing field is the right thing to do here. Please release the full model.

Feb 16 '19 14:02 iurimatias

So how many other innovations are you guys going to keep closed? Say next week you have an even bigger breakthrough: will the full model now be superseded and seem less harmful, so that you may decide to release it? See, it does not make sense. How do you put a limit on unknown capabilities?

Feb 16 '19 14:02 superjayman

Everyone here could benefit from Nick Bostrom's The Unfinished Fable of the Sparrows as presented in his 2014 book about this subject, Superintelligence: Paths, Dangers, Strategies. Dr. Bostrom is Director of the Future of Humanity Institute at the University of Oxford. https://youtu.be/7rRJ9Ep1Wzs

Feb 16 '19 18:02 gabefair

I think you could at least open a challenge soon, e.g. like Google's fake audio detection challenge, and then release the full model after the community has a detection baseline.

Feb 16 '19 23:02 bhack

Well... Here is what I predict will happen very soon and why. The thing your software can do will be replicated and released for the whole world within months, maybe even weeks. It will grow just like deep fakes, and college students will be using it to write their finals in the fall. The media has blasted the fact that you have a new toy and you refuse to share. Now that people know what type of coverage they can expect for a fully released version, they will not care about consequences. They will get the publicity and the feedback they need to make it even better. From that point forward all the phone apps, DIY personal assistant devices, and automated blog post generators will say powered by [insert company]. Yes, that same company name will be associated with the fake Amazon reviews, but when it comes to business and economics, bad publicity is still publicity. The upside is that instead of "encouraging" the government and other agencies to address these issues, they will be forced to. This train is coming and I am afraid that you putting pennies on the track is not going to stop it. Heck, my 15 year old uses Python in ways that would never have occurred to me. Honestly, I personally couldn't pull this off without a team, but I am sure there are investors out there that see dollar signs in being first. I am sure you have gotten some very interesting e-mails reinforcing that sentiment. If I were in your position I would reconsider my decision to release the full project or at least set a date. People tend to be more productive when they are up against the clock. 90 days would certainly be enough time for these big companies to prepare and more than enough time for governments to educate their patrons about the swarm of "fake news" headed their way. I read that last sentence and spit my drink out hahahaha. Anyway, read my post in your meeting Monday morning and reevaluate your decision. Great job by the way. It must be awesome to see the results first hand.
//This post was written by a human.//

Feb 17 '19 00:02 freecode-ai

Release the kraken!

Feb 17 '19 01:02 joemillervi

100 Percent, Agreed! It will not be long until this is replicated anyway.

Feb 17 '19 12:02 superjayman

Is it sad that I wrote a program to check each https://storage.googleapis.com/gpt-2/models/* directory from 1 to 999 for the letters M, G, & T? The only thing it found was 117M ;)

Feb 18 '19 20:02 clintonm9
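
A program like the one clintonm9 describes could look roughly like the sketch below. It is not the poster's actual code. The base URL comes from the post; the name pattern (a number from 1 to 999 followed by M, G, or T) follows the post's description; the `checkpoint` file used as the probe target and all function names are assumptions made for illustration.

```python
from typing import Callable, Iterator, List
import urllib.request

BASE_URL = "https://storage.googleapis.com/gpt-2/models"  # taken from the post
SUFFIXES = ("M", "G", "T")


def candidate_names(start: int = 1, stop: int = 999) -> Iterator[str]:
    """Yield names such as '1M', '1G', '1T', ..., '999T'."""
    for n in range(start, stop + 1):
        for suffix in SUFFIXES:
            yield f"{n}{suffix}"


def url_exists(url: str, timeout: float = 5.0) -> bool:
    """Return True if a HEAD request to url answers with HTTP 200."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except OSError:  # covers URLError, HTTPError, and timeouts
        return False


def find_models(
    probe: Callable[[str], bool] = url_exists,
    probe_file: str = "checkpoint",  # assumed file name inside each model directory
) -> List[str]:
    """Return every candidate name whose probe URL exists."""
    return [
        name
        for name in candidate_names()
        if probe(f"{BASE_URL}/{name}/{probe_file}")
    ]
```

Calling `find_models()` with the default probe would issue 2,997 HEAD requests in sequence, so it is slow; passing a different `probe` function makes it possible to try the logic without touching the network.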
