
Add checkpoints for fine-tuned models

Open jbischof opened this issue 3 years ago • 2 comments

We have numerous checkpoints for text encoders, but there's a lot of value in offering ready-to-go fine-tuned models as well.

Thoughts:

  • Let's start with fine-tuning on SST using a BertBase backbone to minimize the amount of novel code.
  • Add a weights argument to BertClassifier and a classifier_checkpoints struct to bert_tasks.py (see the sketch after this list).
  • Each classifier_checkpoints entry will need to specify num_classes.
    • This is a baby step towards configs, but I cannot imagine writing a config-in-code class like BertClassifierWith4Classes(). Other tasks might need several params in the checkpoint struct.
  • We need to choose whether we want a default for weights or backbone. To me backbone is more natural than giving everyone sentiment analysis by default.
  • Add example usage in the BertClassifier docstring.
  • While there are plenty of checkpoints to convert around the web, I wonder if we should train our own. This would be a good e2e test of our current UX and code and good material for an eventual sentiment analysis demo.
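
For concreteness, here is a minimal sketch of what this could look like; the struct layout, checkpoint name, URL, and exact argument names are all placeholders rather than a settled design:

import keras_nlp

# bert_tasks.py (hypothetical): metadata for fine-tuned classifier checkpoints.
classifier_checkpoints = {
    "sst": {
        "num_classes": 2,
        "weights_url": "...",  # placeholder for the hosted checkpoint
    },
}

# User-facing usage: the weights argument selects a ready-to-go fine-tuned head.
classifier = keras_nlp.models.BertClassifier(weights="sst")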

jbischof avatar Sep 21 '22 17:09 jbischof

This all sounds good to me!

SST as the first offering sounds great.

Each classifier_checkpoints entry will need to specify num_classes.

This sounds good to me too! I think the "Keras style" worry was more specifically about config objects exposed in API symbols, not about tracking metadata for checkpoints internally.

It seems like, for now at least, we have a pattern where the weights/vocabulary arguments imply a value for other args: backbone weights sets vocabulary_size, classifier weights sets num_classes, preprocessing vocabulary sets lowercase. I know that is probably not your ideal state, but at least we are consistent in the "mental model" we are giving to users, and we can continue to discuss ways to avoid or mitigate arg-arg interactions in our signatures.
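
To spell that pattern out, something like the following (checkpoint and class names here are illustrative only, not the exact surface):

import keras_nlp

# Each name-valued argument implies a value for another argument:
backbone = keras_nlp.models.BertBase(weights="uncased_en")          # implies vocabulary_size
classifier = keras_nlp.models.BertClassifier(weights="sst")         # would imply num_classes
preprocessor = keras_nlp.models.BertPreprocessor(vocabulary="uncased_en")  # implies lowercase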

I wonder if we should train our own.

sgtm! The most natural place for this seems like bert/examples, I think? It is explicitly not library code, but code that will be used to generate the checkpoints used by library code.

Some broader planning might be in order here for the examples code, i.e.:

  • We should probably rewrite the bert pretraining and glue fine-tuning scripts so that checkpoints are the de facto currency passed between scripts.
  • We should probably make a standalone bert classification fine-tuning script for uses like training our sst model, but we should still figure out how that interacts with the glue fine-tuning offering. Do we have two separate scripts, one that runs sst inside glue and one that fine-tunes for classification on a standalone dataset? Or is the glue script a meta-script that invokes our classifier script multiple times? (A rough sketch of the standalone script follows this list.)
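
A rough sketch of what the standalone fine-tuning script could look like; the flags, the load_sst2 helper, and the constructor signatures are hypothetical, and preprocessing is elided:

import argparse

import keras_nlp
from tensorflow import keras

parser = argparse.ArgumentParser()
parser.add_argument("--backbone_weights", default="uncased_en")       # checkpoint in
parser.add_argument("--num_classes", type=int, default=2)
parser.add_argument("--output_weights", default="bert_base_sst.h5")   # checkpoint out
args = parser.parse_args()

# Checkpoints are the contract between scripts: read a backbone checkpoint in,
# write a fine-tuned classifier checkpoint out.
backbone = keras_nlp.models.BertBase(weights=args.backbone_weights)
classifier = keras_nlp.models.BertClassifier(backbone, num_classes=args.num_classes)

train_ds = load_sst2()  # hypothetical helper returning a tokenized tf.data.Dataset
classifier.compile(
    optimizer="adam",
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"],
)
classifier.fit(train_ds, epochs=3)
classifier.save_weights(args.output_weights)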

mattdangerw avatar Sep 21 '22 18:09 mattdangerw

Each classifier_checkpoints entry will need to specify num_classes.

Just to clarify: do we want to have this information in the saved file or in a config? My preference is a config, e.g.:

"sst": {
  "num_classes": 2
  "weights_url": .....
}
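
Then resolving a fine-tuned checkpoint at load time could be a simple lookup, e.g. (a sketch only; the download step and key names are assumptions):

from tensorflow import keras

# The config dict above, repeated here so the sketch is self-contained.
classifier_checkpoints = {"sst": {"num_classes": 2, "weights_url": "..."}}

# Hypothetical resolution when a user passes weights="sst":
metadata = classifier_checkpoints["sst"]
num_classes = metadata["num_classes"]  # implied by the checkpoint
weights_path = keras.utils.get_file(origin=metadata["weights_url"])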

We need to choose if we want a default for weights or backbone.

We could also go with no defaults and rely on users to choose their desired architecture?

I wonder if we should train our own.

I vote for fine-tuning on our own side rather than loading existing checkpoints. Doing the fine-tuning ourselves lets us use our pretrained model offering as a real user would, so we can find any conflicts.

chenmoneygithub avatar Sep 21 '22 20:09 chenmoneygithub

@jbischof should we mark this as done?

mattdangerw avatar Nov 30 '22 21:11 mattdangerw

Fixed by #494

jbischof avatar Nov 30 '22 21:11 jbischof