Retrain `bert_tiny_uncased_en_sst2` BertClassifier to reflect the dropout change

Open chenmoneygithub opened this issue 3 years ago • 8 comments

We added a dropout layer to keras_nlp.models.BertClassifier, so we need to update the presets accordingly.

chenmoneygithub avatar Dec 06 '22 21:12 chenmoneygithub

@jbischof Jon, I believe you still have the training script?

chenmoneygithub avatar Dec 06 '22 21:12 chenmoneygithub

Yes, it's in our repo (link).

My goal was to have a task preset primarily for API development and tutorial writing. It's OK if these models get better over time.

jbischof avatar Dec 06 '22 21:12 jbischof

Hi @chenmoneygithub, I would like to work on this issue. Could you please guide me a little? As I understand it, I need to train BertClassifier twice with different dropout_probs to show how this change affects the final trained model. Is that right?

cc : @jbischof

susnato avatar Feb 23 '23 04:02 susnato

@susnato Thanks for your interest!

To clarify: we finetuned BERT on SST2 earlier to make bert_tiny_uncased_en_sst2, but at that time our BertClassifier did not have the dropout layer. So what we need to do now is finetune BertClassifier on SST2 again. Your work will include:

  1. Write a colab that finetunes keras_nlp.models.BertClassifier on SST2.
  2. Report the evaluation score on the validation set.
  3. Share the colab with us by opening a PR to

Then we will run your colab to generate the checkpoint and upload it to our Google Cloud Storage. Since no code will be checked in, we will explicitly credit you in the code and in our documentation on keras-io.
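
For readers following along, steps 1 and 2 might look roughly like the sketch below. It is not the script from the repo. The backbone preset name `bert_tiny_en_uncased`, the `glue/sst2` dataset from TensorFlow Datasets, and the hyperparameters (batch size, learning rate, shuffle buffer) are all assumptions chosen for illustration, and the helper names `to_xy` and `finetune_on_sst2` are hypothetical.

```python
def to_xy(example):
    """Turn one SST2 record (a dict) into a (sentence, label) pair."""
    return example["sentence"], example["label"]


def finetune_on_sst2(epochs=2, batch_size=32, learning_rate=5e-5):
    """Finetune BertClassifier on SST2 and return validation metrics.

    All default values here are illustrative assumptions.
    """
    # Heavy imports live inside the function so importing this file is cheap.
    import keras
    import keras_nlp
    import tensorflow_datasets as tfds

    # Assumed data source: the GLUE SST2 split shipped with TFDS.
    train_ds, val_ds = tfds.load("glue/sst2", split=["train", "validation"])
    train_ds = train_ds.map(to_xy).shuffle(10_000).batch(batch_size)
    val_ds = val_ds.map(to_xy).batch(batch_size)

    # Assumed preset name for the pretrained (not yet finetuned) backbone.
    # The classifier's built-in preprocessor tokenizes the raw strings.
    classifier = keras_nlp.models.BertClassifier.from_preset(
        "bert_tiny_en_uncased", num_classes=2
    )
    classifier.compile(
        # Assumes the classifier head returns unnormalized logits.
        loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        optimizer=keras.optimizers.Adam(learning_rate),
        metrics=["sparse_categorical_accuracy"],
    )
    classifier.fit(train_ds, validation_data=val_ds, epochs=epochs)

    # Step 2: the evaluation score on the validation set, as a dict.
    return classifier.evaluate(val_ds, return_dict=True)


if __name__ == "__main__":
    print(finetune_on_sst2())
```

Running the file as a script downloads the dataset and the pretrained weights, so it is best done in a colab with a GPU runtime. The returned dict is the number to report in the PR.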

chenmoneygithub avatar Feb 23 '23 05:02 chenmoneygithub

@chenmoneygithub Thanks for replying! I will train it for 2 epochs, since 5 epochs was taking a lot of time.

susnato avatar Feb 23 '23 05:02 susnato

Hey, I would like to take this up.

jayam30 avatar Feb 23 '23 15:02 jayam30

Hi @jayam30, as you can see, I have already submitted a PR for this issue and we are working on it. You can choose other issues that need an immediate fix from this list.

susnato avatar Feb 23 '23 15:02 susnato

This issue is stale because it has been open for 180 days with no activity. It will be closed if no further activity occurs. Thank you.

github-actions[bot] avatar Mar 12 '24 01:03 github-actions[bot]