keras-nlp
Add checkpoints for fine-tuned models
We have numerous checkpoints for text encoders, but there's a lot of value in offering ready-to-go fine-tuned models as well.
Thoughts:
- Let's start with fine-tuning on SST using a `BertBase` backbone to minimize the amount of novel code.
- Add a `weights` argument to `BertClassifier` and a `classifier_checkpoints` struct to `bert_tasks.py` (see the sketch after this list).
- Each `classifier_checkpoints` entry will need to specify `num_classes`.
  - This is a baby step towards configs, but I cannot imagine writing a config-in-code class like `BertClassifierWith4Classes()`. Other tasks might need several params in the checkpoint struct.
- We need to choose if we want a default for `weights` or `backbone`. To me `backbone` is more natural than giving everyone sentiment analysis by default.
- Add example usage in the `BertClassifier` docstring.
- While there are plenty of checkpoints to convert around the web, I wonder if we should train our own. This would be a good e2e test of our current UX and code, and good material for an eventual sentiment analysis demo.
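A rough sketch of how the `weights` argument and the `classifier_checkpoints` struct could fit together; the checkpoint name, URL, and the stand-in constructor below are placeholders for discussion, not the real keras-nlp API or assets:

```python
# Hypothetical sketch only: the checkpoint name, URL, and signature are
# placeholders, not the actual keras-nlp API or hosted assets.
classifier_checkpoints = {
    "sst": {
        "num_classes": 2,
        "weights_url": "https://example.com/bert_base_sst.h5",  # placeholder
    },
}


def bert_classifier(backbone=None, weights=None, num_classes=None):
    """Toy stand-in for the `BertClassifier` constructor."""
    if weights is not None:
        # The checkpoint entry implies `num_classes`, so users could simply
        # write `BertClassifier(weights="sst")`.
        num_classes = classifier_checkpoints[weights]["num_classes"]
    if num_classes is None:
        raise ValueError("Pass either `weights` or an explicit `num_classes`.")
    # ... build the classification head on `backbone`, then download and
    # load `weights_url` when `weights` is given ...
    return num_classes


assert bert_classifier(weights="sst") == 2
```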
This all sounds good to me!
SST as the first offering sounds great.
> Each `classifier_checkpoints` entry will need to specify `num_classes`.
This sounds good to me too! I think the "Keras style" worry was more specifically about config objects exposed in API symbols, not about tracking metadata for checkpoints internally.
It seems like, for now at least, we have a pattern where the weights/vocabulary arguments imply a value for other args: backbone weights set `vocabulary_size`, classifier weights set `num_classes`, and the preprocessing vocabulary sets `lowercase`. I know that is probably not your ideal state, but at least we are consistent in the "mental model" we are giving to users, and we can continue to discuss ways to avoid or mitigate arg-arg interactions in our signatures.
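For what it's worth, a toy sketch of how that "one argument implies another" pattern could be kept consistent and conflicts surfaced early across the three cases above; `resolve_implied` and the literal values are hypothetical, not library code:

```python
# Toy illustration only: one shared helper for the pattern where a
# weights/vocabulary argument implies the value of another argument.
def resolve_implied(name, explicit, implied):
    """Return the implied value, erroring on a conflicting explicit value."""
    if explicit is not None and explicit != implied:
        raise ValueError(
            f"`{name}={explicit}` conflicts with `{name}={implied}` implied "
            f"by the selected weights/vocabulary."
        )
    return implied


# The three cases from the discussion (values are illustrative):
vocabulary_size = resolve_implied("vocabulary_size", explicit=None, implied=30522)
num_classes = resolve_implied("num_classes", explicit=None, implied=2)
lowercase = resolve_implied("lowercase", explicit=None, implied=True)
```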
> I wonder if we should train our own.
sgtm! The most natural place for this seems like `bert/examples`, I think? This is explicitly not library code; it will be used to generate the checkpoints used in library code.
Some broader planning might be in order here for the examples code, i.e.:
- We should probably rewrite the bert pretraining and glue fine-tuning scripts so that checkpoints are the de facto currency passed between scripts.
- We should probably make a standalone bert classification fine-tuning script for uses like training our sst model, but we still need to figure out how that interacts with the glue fine-tuning offering. Do we have two separate scripts, one that runs sst inside glue and one that fine-tunes for classification on a standalone dataset? Or is the glue script a meta-script that invokes our classifier script multiple times? (A hypothetical shape for such a script is sketched below.)
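One possible flag surface for the standalone fine-tuning script, with checkpoints as the only artifact passed between stages; the flag names, defaults, and the GLUE meta-script idea are all assumptions, not settled design:

```python
# Hypothetical flag surface for a standalone classification fine-tuning
# script under the bert examples; names and defaults are illustrative only.
import argparse

parser = argparse.ArgumentParser(
    description="Fine-tune a BERT classifier starting from a pretrained checkpoint."
)
parser.add_argument("--backbone_weights", help="Checkpoint written by the pretraining script.")
parser.add_argument("--dataset", default="sst2", help="Classification dataset to fine-tune on.")
parser.add_argument("--num_classes", type=int, default=2)
parser.add_argument("--output_weights", help="Where to write the fine-tuned checkpoint.")
args = parser.parse_args()

# ... load the backbone from args.backbone_weights, attach a classification
# head, train on args.dataset, and save to args.output_weights ...
# A glue "meta-script" could then loop over tasks and invoke this script once
# per task, keeping checkpoints as the only currency between the two.
```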
> Each `classifier_checkpoints` entry will need to specify `num_classes`.
Just to clarify - do we want to have this information in the saved file, or a config? My preference is to go with a config, e.g.:
"sst": {
"num_classes": 2
"weights_url": .....
}
> We need to choose if we want a default for `weights` or `backbone`.
We may also go with no defaults? We can rely on users to choose their desired architecture.
> I wonder if we should train our own.
I vote for fine-tuning on our own side rather than loading checkpoints. Doing the fine-tuning ourselves lets us use our pretrained model offering as a real user would, so we can find any conflicts.
@jbischof should we mark this as done?
Fixed by #494