Add GPT-2 Checkpoints

Open abheesht17 opened this issue 3 years ago • 8 comments

abheesht17 avatar Sep 17 '22 17:09 abheesht17

Thanks! This looks good to me at a high level. I see a lot of places where this and https://github.com/keras-team/keras-nlp/pull/361 are treading the same ground, so maybe we should merge this after the earlier PR lands?

Also, it is nice how the ID approach in https://github.com/keras-team/keras-nlp/pull/361 elegantly solves the issue of what to call these checkpoints. Instead of "webtext" or "yes_please", we have "gpt2_base", which reads great to me.

mattdangerw avatar Sep 20 '22 01:09 mattdangerw

Also, just a note: after landing https://github.com/keras-team/keras-nlp/pull/361, I think we should move the GCP bucket directories so that they live in keras-nlp/models/gpt2_base, and not keras-nlp/models/gpt2_base_webtext.

Edit: just noticed that @abheesht17 already has a TODO for this :)

mattdangerw avatar Sep 20 '22 18:09 mattdangerw

Great work! Mostly minor changes to sync with the final version of #361.

I'm not sure about our initializer strategy, however. I will need to check up on that.

jbischof avatar Sep 20 '22 23:09 jbischof

Thanks for adding the defaults! I added a small comment about adding test coverage.

jbischof avatar Sep 21 '22 17:09 jbischof

@abheesht17 I have moved the gpt2 checkpoints:

gs://keras-nlp/models/gpt2_base/:
gs://keras-nlp/models/gpt2_base/merges.txt
gs://keras-nlp/models/gpt2_base/model.h5
gs://keras-nlp/models/gpt2_base/vocab.json

gs://keras-nlp/models/gpt2_extra_large/:
gs://keras-nlp/models/gpt2_extra_large/merges.txt
gs://keras-nlp/models/gpt2_extra_large/model.h5
gs://keras-nlp/models/gpt2_extra_large/vocab.json

gs://keras-nlp/models/gpt2_large/:
gs://keras-nlp/models/gpt2_large/merges.txt
gs://keras-nlp/models/gpt2_large/model.h5
gs://keras-nlp/models/gpt2_large/vocab.json

gs://keras-nlp/models/gpt2_medium/:
gs://keras-nlp/models/gpt2_medium/merges.txt
gs://keras-nlp/models/gpt2_medium/model.h5
gs://keras-nlp/models/gpt2_medium/vocab.json

jbischof avatar Sep 21 '22 17:09 jbischof
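For readers who want to see how the bucket layout above maps to downloadable URLs, here is a minimal sketch. It assumes the objects are publicly readable and relies on the standard `https://storage.googleapis.com/<bucket>/<object>` form for public Cloud Storage objects. The helper names `gs_to_https` and `preset_urls` are hypothetical and are not taken from the keras-nlp code base.

```python
from urllib.parse import urlparse

# Preset directory names and file names, copied from the listing above.
PRESETS = ["gpt2_base", "gpt2_medium", "gpt2_large", "gpt2_extra_large"]
FILES = ["merges.txt", "model.h5", "vocab.json"]


def gs_to_https(gs_uri: str) -> str:
    """Convert a gs:// URI into the public HTTPS URL for the same object."""
    parsed = urlparse(gs_uri)
    if parsed.scheme != "gs":
        raise ValueError(f"Expected a gs:// URI, got: {gs_uri}")
    # netloc is the bucket name; path is the object path with a leading "/".
    return f"https://storage.googleapis.com/{parsed.netloc}{parsed.path}"


def preset_urls(preset: str) -> dict:
    """Return a {file name: HTTPS URL} mapping for one preset directory."""
    base = f"gs://keras-nlp/models/{preset}"
    return {name: gs_to_https(f"{base}/{name}") for name in FILES}


if __name__ == "__main__":
    for preset in PRESETS:
        for name, url in preset_urls(preset).items():
            print(preset, name, url)
```

Keeping the preset name as the only variable part of the path is what makes a short ID such as "gpt2_base" sufficient to locate all three files.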

@jbischof, thanks! I have changed the URLs and added a unit test for the defaults.

Edit: The URLs don't seem to work.

abheesht17 avatar Sep 22 '22 00:09 abheesht17

Fixed permissions on URLs yesterday @abheesht17

jbischof avatar Sep 22 '22 20:09 jbischof

> Fixed permissions on URLs yesterday @abheesht17

Thanks, @jbischof! I have run network tests locally and all of them pass!

abheesht17 avatar Sep 22 '22 21:09 abheesht17
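The permissions problem discussed above can be caught before running the full network tests with a quick reachability check. The sketch below uses only the Python standard library and is not the test code from this PR; `head_status` and `unreachable` are hypothetical helper names. It treats any status other than 200 as a failure, which would typically include the error returned when an object is not publicly readable.

```python
import urllib.error
import urllib.request


def head_status(url: str, timeout: float = 10.0) -> int:
    """Send a HEAD request and return the HTTP status code."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urlopen raises on 4xx/5xx responses; the status is on the exception.
        return err.code


def unreachable(urls, get_status=head_status):
    """Return the URLs that did not answer with HTTP 200.

    get_status is injectable so the logic can be exercised offline.
    """
    return [url for url in urls if get_status(url) != 200]


if __name__ == "__main__":
    # URL derived from the gs:// listing earlier in the thread.
    urls = ["https://storage.googleapis.com/keras-nlp/models/gpt2_base/vocab.json"]
    print(unreachable(urls))
```

An empty list means every URL responded with 200. A HEAD request avoids downloading the large `model.h5` files just to confirm access.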