# Upload Model to Kaggle
Implement upload_preset() to allow users to upload Model presets to Kaggle.
From what I understand from this PR and https://github.com/keras-team/keras-nlp/pull/1510#discussion_r1523759400, the goal is to be able to do something like this, right?
```python
tokenizer.save_to_preset(dir)
backbone.save_to_preset(dir)
upload_preset("kaggle://user/model", dir)
```
Given that `load_from_preset` is able to load from a local dir or a Kaggle URI, wouldn't it be nice to also allow both behaviors in `save_to_preset`? Something like this:
```python
# Case 1: save to local directory + upload it
tokenizer.save_to_preset(dir)
backbone.save_to_preset(dir)
upload_preset("kaggle://user/model", dir)

# Case 2: upload directly (i.e. not saved locally, or only to a tmp dir)
tokenizer.save_to_preset("kaggle://user/model")
backbone.save_to_preset("kaggle://user/model")
```
My suggestion is simply to add an "if the path starts with the prefix, save to a tmp dir and upload directly" branch in `save_to_preset` (sketched below). In any case, I'm fine with any solution. From a huggingface_hub point of view it'll be straightforward to implement (similar to the `kagglehub.model_upload(kaggle_handle, preset)` line).
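A minimal sketch of that branch, assuming a hypothetical `_save_to_local_preset` helper standing in for the existing local-save logic:

```python
import tempfile

import kagglehub

KAGGLE_PREFIX = "kaggle://"

def save_to_preset(self, preset):
    if preset.startswith(KAGGLE_PREFIX):
        # Save into a throwaway directory, then push it in a single call.
        local_dir = tempfile.mkdtemp()
        self._save_to_local_preset(local_dir)  # hypothetical: existing local-save path
        kaggle_handle = preset[len(KAGGLE_PREFIX):]
        kagglehub.model_upload(kaggle_handle, local_dir)
    else:
        self._save_to_local_preset(preset)
```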
Thanks for reviewing and sharing your feedback, @Wauplin!
Right! Case 1 is how we're planning to implement the model upload: save to a local dir and then upload!
I think uploading directly (case 2) is a nice design. However, kagglehub has immutable model versions, so in case 2, uploading the tokenizer creates version X of the model and uploading the backbone later creates version X+1.
We need to have all the model components saved before uploading.
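To make the version bookkeeping concrete (the version numbers are illustrative):

```python
# Case 2: each direct upload creates a new immutable Kaggle version.
tokenizer.save_to_preset("kaggle://user/model")  # creates version X (tokenizer only)
backbone.save_to_preset("kaggle://user/model")   # creates version X+1 (adds backbone)

# Case 1: assemble locally, upload once, get one complete version.
tokenizer.save_to_preset(dir)
backbone.save_to_preset(dir)
upload_preset("kaggle://user/model", dir)        # single version with all components
```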
> I think uploading directly (case 2) is a nice design. However, kagglehub has immutable model versions, so in case 2, uploading the tokenizer creates version X of the model and uploading the backbone later creates version X+1.
Understood! Then leaving it as it is is fine I guess :) Thanks for the explanation!
Let me know once this is merged and I can contribute the hf:// integration right after.
It's been helpful for me to list out some design principles we could follow here:
- Idempotency. Save an object into the preset, get the same object out.
- Ability to break down to lower-level core Keras APIs, e.g. load a backbone with `keras.saving.deserialize_keras_object(file)` + `model.load_weights(file)` (see the sketch after this list).
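For example, a backbone saved in a preset should be recoverable with nothing but core Keras calls. A sketch, assuming illustrative file names (`config.json`, `model.weights.h5`) inside the preset directory:

```python
import json

import keras

# Rebuild the model object from its serialized config...
with open("preset/config.json") as f:
    backbone = keras.saving.deserialize_keras_object(json.load(f))
# ...then restore its weights from the standalone weights file.
backbone.load_weights("preset/model.weights.h5")
```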
I think we have those, except for the preprocessor issue above.
Also, a question: if a user is saving a task preset and the preprocessor is None (either it's a custom function rather than a layer, or it's unattached to the task), what do we do? If we want idempotent saving, saving a task with no preprocessing means you will load a task with no preprocessing. But we might want to indicate to users that this might hurt the usability of their preset. A warning?
Thanks for sharing your thoughts, Matt! As we discussed in the meeting, we want to be able to support partial model upload, e.g. preprocessor without backbone or backbone without tokenizer. So we just need to build safeguards around it and make sure the user knows what they're doing.
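One possible safeguard, a sketch in which the check and the warning text are illustrative:

```python
import warnings

import keras

def save_to_preset(task, preset_dir):
    if not isinstance(task.preprocessor, keras.layers.Layer):
        # Covers both preprocessor=None and plain-function preprocessors.
        warnings.warn(
            "Saving a task preset without an attached preprocessor layer; "
            "the preset will load with no preprocessing, which may limit "
            "its usability for other users."
        )
    # ... proceed with the (partial) save ...
```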
@fchollet what do you think about where we expose our new symbol?
`keras_nlp.upload_preset()` vs `keras_nlp.utils.upload_preset()` vs `keras_nlp.models.upload_preset()`?
> `keras_nlp.upload_preset()` vs `keras_nlp.utils.upload_preset()` vs `keras_nlp.models.upload_preset()`?

Is a preset always model-related? If not, I'd recommend `keras_nlp.upload_preset()`.