
Convert our checkpoint colabs into runnable scripts

mattdangerw opened this issue 3 years ago • 23 comments

The colabs we currently have in tools/checkpoint_conversion are useful in that we don't lose the code for converting checkpoints, but they are fairly unwieldy. They must be pointed at the specific branch used for model development, they are very long, and we need one for each model variant.

Instead, we could write one script per model that handles checkpoint conversion (perhaps with a flag to control the model variant?). Potential file structure:

tools
└── checkpoint_conversion
    ├── README.md
    ├── convert_bert_weights.py
    ├── convert_gpt2_weights.py
    └── requirements.txt

This will make it much easier to re-run and test checkpoint conversion code in the future.
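Such a script might be skeletoned as follows. This is only a sketch: the preset names, checkpoint identifiers, and helper function names are illustrative, and argparse stands in for whatever flag library the repo settles on.

```python
# Hypothetical skeleton for tools/checkpoint_conversion/convert_bert_weights.py.
# argparse is used for illustration; the real scripts may prefer absl flags.
import argparse

# One entry per supported preset; the checkpoint names here are placeholders.
PRESETS = {
    "bert_base_en_uncased": "uncased_L-12_H-768_A-12",
    "bert_base_en": "cased_L-12_H-768_A-12",
}


def parse_args(argv=None):
    parser = argparse.ArgumentParser(
        description="Convert BERT checkpoints to KerasNLP format."
    )
    parser.add_argument(
        "--preset",
        choices=sorted(PRESETS),
        required=True,
        help="Which BERT variant to convert.",
    )
    return parser.parse_args(argv)


def main(argv=None):
    args = parse_args(argv)
    # download_model(), convert_checkpoints(), and check_output() would be
    # defined alongside, one function per conversion step.
    print(f"Converting preset {args.preset} ({PRESETS[args.preset]})")
```

Running `python convert_bert_weights.py --preset bert_base_en` would then drive the whole conversion for one variant, which keeps a single script reusable across presets.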

mattdangerw avatar Nov 18 '22 21:11 mattdangerw

I will take it

vulkomilev avatar Nov 28 '22 19:11 vulkomilev

@vulkomilev, please go ahead with writing the conversion script for BERT! You can follow the same template as RoBERTa's script: https://github.com/keras-team/keras-nlp/blob/master/tools/checkpoint_conversion/convert_roberta_checkpoints.py.

abheesht17 avatar Dec 03 '22 03:12 abheesht17

okay

vulkomilev avatar Dec 03 '22 17:12 vulkomilev

Where can I find BertBase (keras_nlp.models.BertBase)?

vulkomilev avatar Dec 06 '22 18:12 vulkomilev

@vulkomilev, KerasNLP does not have a separate class for BertBase. There is a model class for BertBackbone: https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/models/bert/bert_backbone.py#L35. If you want the base variant of BERT, you can do this:

# Build the architecture without loading the pretrained weights.
bert_base = keras_nlp.models.BertBackbone.from_preset("bert_base_uncased_en", load_weights=False)

# Load the model with the pretrained weights.
bert_base = keras_nlp.models.BertBackbone.from_preset("bert_base_uncased_en", load_weights=True)

These "presets" are drawn from here: https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/models/bert/bert_presets.py.

Regarding checkpoint conversion for BERT, follow the same format as RoBERTa. Use the conversion notebooks mentioned in this directory as reference: https://github.com/keras-team/keras-nlp/tree/master/tools/checkpoint_conversion.

So, for example, contents of this cell


# Model garden BERT paths.
zip_path = f"""https://storage.googleapis.com/tf_model_garden/nlp/bert/v3/{TOKEN_TYPE}_L-12_H-768_A-12.tar.gz"""
zip_file = keras.utils.get_file(
    f"""/content/{MODEL_NAME}""",
    zip_path,
    extract=True,
    archive_format="tar",
)

can go in the download_model() function.
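That cell might be wrapped into a download_model() helper along these lines. This is a sketch: garden_zip_url and the lazy keras import are my own framing, and the URL template simply mirrors the cell's zip_path.

```python
# Sketch of a download_model() helper wrapping the notebook cell above.
# The template mirrors the cell's zip_path for Model Garden checkpoints.
GARDEN_URL = (
    "https://storage.googleapis.com/tf_model_garden/nlp/bert/v3/"
    "{token_type}_L-12_H-768_A-12.tar.gz"
)


def garden_zip_url(token_type):
    # Build the Model Garden archive URL for a given token type
    # ("uncased" or "cased"), matching the notebook's zip_path.
    return GARDEN_URL.format(token_type=token_type)


def download_model(model_name, token_type):
    # Import keras lazily so the URL helper stays importable without TF.
    import keras

    return keras.utils.get_file(
        model_name,
        garden_zip_url(token_type),
        extract=True,
        archive_format="tar",
    )
```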

Contents of this cell:

model.get_layer("token_embedding").embeddings.assign(
    weights["encoder/layer_with_weights-0/embeddings/.ATTRIBUTES/VARIABLE_VALUE"]
)
model.get_layer("position_embedding").position_embeddings.assign(
    weights["encoder/layer_with_weights-1/embeddings/.ATTRIBUTES/VARIABLE_VALUE"]
)
model.get_layer("segment_embedding").embeddings.assign(
    weights["encoder/layer_with_weights-2/embeddings/.ATTRIBUTES/VARIABLE_VALUE"]
)
model.get_layer("embeddings_layer_norm").gamma.assign(
    weights["encoder/layer_with_weights-3/gamma/.ATTRIBUTES/VARIABLE_VALUE"]
)
model.get_layer("embeddings_layer_norm").beta.assign(
    weights["encoder/layer_with_weights-3/beta/.ATTRIBUTES/VARIABLE_VALUE"]
)

for i in range(model.num_layers):
    model.get_layer(f"transformer_layer_{i}")._self_attention_layer._key_dense.kernel.assign(
        weights[f"encoder/layer_with_weights-{i + 4}/_attention_layer/_key_dense/kernel/.ATTRIBUTES/VARIABLE_VALUE"]
    )
    model.get_layer(f"transformer_layer_{i}")._self_attention_layer._key_dense.bias.assign(
        weights[f"encoder/layer_with_weights-{i + 4}/_attention_layer/_key_dense/bias/.ATTRIBUTES/VARIABLE_VALUE"]
    )
    model.get_layer(f"transformer_layer_{i}")._self_attention_layer._query_dense.kernel.assign(
        weights[f"encoder/layer_with_weights-{i + 4}/_attention_layer/_query_dense/kernel/.ATTRIBUTES/VARIABLE_VALUE"]
    )
    model.get_layer(f"transformer_layer_{i}")._self_attention_layer._query_dense.bias.assign(
        weights[f"encoder/layer_with_weights-{i + 4}/_attention_layer/_query_dense/bias/.ATTRIBUTES/VARIABLE_VALUE"]
    )
...

can go in convert_checkpoints().
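For instance, the repeated checkpoint keys in that loop could be collapsed with a small helper. This is a hypothetical simplification, not code from the notebooks; attention_key is an invented name.

```python
# Sketch of a helper that builds the Model Garden checkpoint keys used in
# the loop above, so the repeated f-strings can be collapsed.
KEY_TEMPLATE = (
    "encoder/layer_with_weights-{index}/_attention_layer/"
    "{dense}/{attr}/.ATTRIBUTES/VARIABLE_VALUE"
)


def attention_key(layer, dense, attr):
    # Layers 0..3 in the checkpoint hold the embeddings and their layer
    # norm, so transformer layer i lives at layer_with_weights-(i + 4).
    return KEY_TEMPLATE.format(index=layer + 4, dense=dense, attr=attr)
```

The loop body would then read, e.g., `weights[attention_key(i, "_key_dense", "kernel")]`.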

etc., etc.

The conversion script should work for all BERT presets (passed as an arg to the script).

abheesht17 avatar Dec 06 '22 20:12 abheesht17

Hey, @vulkomilev! Are you working on this?

abheesht17 avatar Jan 10 '23 11:01 abheesht17

Yes, I will provide code this week.

vulkomilev avatar Jan 10 '23 17:01 vulkomilev

Hi, I have uploaded the code at https://github.com/vulkomilev/keras-nlp/blob/master/tools/checkpoint_conversion/convert_bert.py, but it needs more work. For example, I can't find the correct keys in the `weights` array, and I was wondering if the giant `if` in convert_checkpoints can be optimized.

vulkomilev avatar Jan 12 '23 19:01 vulkomilev

@abheesht17, I forgot to tag you; please check the message above. @mattdangerw

vulkomilev avatar Jan 13 '23 11:01 vulkomilev

Hey, @vulkomilev! Taking a look, will get back to you ASAP.

abheesht17 avatar Jan 13 '23 16:01 abheesht17

@vulkomilev, let's reason through this together! :)

First, there are two sources of BERT checkpoints: TF Model Garden and the official BERT repository. Let's make a table of which preset comes from which source.

| Preset | Source | Notebook URL |
| --- | --- | --- |
| bert_tiny_en_uncased | BERT Official Repo | https://github.com/keras-team/keras-nlp/blob/master/tools/checkpoint_conversion/bert_tiny_uncased_en.ipynb |
| bert_small_en_uncased | BERT Official Repo | https://github.com/keras-team/keras-nlp/blob/master/tools/checkpoint_conversion/bert_small_uncased_en.ipynb |
| bert_medium_en_uncased | BERT Official Repo | https://github.com/keras-team/keras-nlp/blob/master/tools/checkpoint_conversion/bert_medium_uncased_en.ipynb |
| bert_base_en_uncased | TF Model Garden | https://github.com/keras-team/keras-nlp/blob/master/tools/checkpoint_conversion/bert_base_uncased.ipynb |
| bert_base_en | TF Model Garden | https://github.com/keras-team/keras-nlp/blob/master/tools/checkpoint_conversion/bert_base_cased.ipynb |
| bert_base_zh | TF Model Garden | https://github.com/keras-team/keras-nlp/blob/master/tools/checkpoint_conversion/bert_base_zh.ipynb |
| bert_base_multi | TF Model Garden | https://github.com/keras-team/keras-nlp/blob/master/tools/checkpoint_conversion/bert_base_multi_cased.ipynb |
| bert_large_en_uncased | BERT Official Repo | https://github.com/keras-team/keras-nlp/blob/master/tools/checkpoint_conversion/bert_large_uncased_en.ipynb |
| bert_large_en | BERT Official Repo | https://github.com/keras-team/keras-nlp/blob/master/tools/checkpoint_conversion/bert_large_cased_en.ipynb |

Now that we have the table ready, let's get to work. The expectation is that the conversion snippet (the if...elif blocks you are talking about) can be simplified based on source, i.e., ideally, it should be the same for all presets derived from the BERT Official Repo, and likewise for TF Model Garden. Let's test this theory!

Let's run diffchecker between bert_tiny_en_uncased and bert_small_en_uncased: https://www.diffchecker.com/yDWyvec0/. The files are identical, and both are derived from "BERT Official Repo".

Now let's compare two presets derived from TF Model Garden, bert_base_en_uncased and bert_base_en: https://www.diffchecker.com/Mi4aI5Is/. Ah, shoot! There is a minor difference of two lines, but it isn't a major worry.

So, the conclusion is that you can have an outer if...else to decide between BERT Official Repo and TF Model Garden. Inside this outer block, you can have smaller if...else blocks for the one or two lines of variation between conversion scripts (where such differences exist, of course). Hope this clears things up!
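The dispatch described above could be sketched like this. It is a hypothetical outline: the preset-to-source mapping mirrors the table, and the actual conversion logic in each branch is elided.

```python
# Sketch of the source-based dispatch: one outer branch per checkpoint
# source, with preset-specific tweaks nested inside as needed.
PRESET_SOURCE = {
    "bert_tiny_en_uncased": "official",
    "bert_small_en_uncased": "official",
    "bert_medium_en_uncased": "official",
    "bert_base_en_uncased": "garden",
    "bert_base_en": "garden",
    "bert_base_zh": "garden",
    "bert_base_multi": "garden",
    "bert_large_en_uncased": "official",
    "bert_large_en": "official",
}


def convert_checkpoints(preset):
    source = PRESET_SOURCE[preset]
    if source == "official":
        # Shared conversion path for BERT Official Repo checkpoints.
        ...
    else:
        # Shared conversion path for TF Model Garden checkpoints, with
        # small per-preset branches where the notebooks differ.
        ...
    return source
```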

abheesht17 avatar Jan 17 '23 14:01 abheesht17

Okay, thanks for the information. I am on it.

vulkomilev avatar Jan 24 '23 21:01 vulkomilev

I made a new commit. Can you please check it out, @abheesht17?

vulkomilev avatar Jan 31 '23 21:01 vulkomilev

bump @abheesht17

vulkomilev avatar Feb 13 '23 17:02 vulkomilev

@vulkomilev, could you please open a PR? That way, everyone can take a look and leave comments on your code. Thanks!

abheesht17 avatar Feb 18 '23 16:02 abheesht17

will do

vulkomilev avatar Feb 18 '23 17:02 vulkomilev

done

vulkomilev avatar Feb 19 '23 07:02 vulkomilev

bump @abheesht17

vulkomilev avatar Feb 24 '23 19:02 vulkomilev

Hey, @vulkomilev! I left a few comments on your PR a couple of days ago.

abheesht17 avatar Feb 24 '23 19:02 abheesht17

Oh, I didn't notice. Sorry!

vulkomilev avatar Feb 24 '23 20:02 vulkomilev

Hey @vulkomilev, are you still working on this issue?

ADITYADAS1999 avatar Mar 12 '23 03:03 ADITYADAS1999

Yes, I need to connect with the other members.

vulkomilev avatar Mar 15 '23 07:03 vulkomilev

This issue is stale because it has been open for 180 days with no activity. It will be closed if no further activity occurs. Thank you.

github-actions[bot] avatar Mar 12 '24 01:03 github-actions[bot]