Convert our checkpoint colabs into runnable scripts
The colabs we currently have in tools/checkpoint_conversion are useful in that we don't lose the code for converting checkpoints, but they are fairly unwieldy. They must be pointed at a specific branch used during model development, they run to a ton of lines of code, and we need one for each model variant.
Instead, we could try to write one script per model that handles checkpoint conversion (perhaps with a flag to control the model variant?). A potential file structure:
```
tools
└── checkpoint_conversion
    ├── README.md
    ├── convert_bert_weights.py
    ├── convert_gpt2_weights.py
    └── requirements.txt
```
This will make it much easier to re-run and test checkpoint conversion code in the future.
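For concreteness, here is a rough sketch of what one such script could look like (the function names and the flag are illustrative, not a settled design):

```python
import argparse


def download_model(preset):
    # Fetch the original checkpoint for the given preset.
    ...


def convert_checkpoints(preset):
    # Map the original weights onto the equivalent KerasNLP model.
    ...


if __name__ == "__main__":
    parser = argparse.ArgumentParser(
        description="Convert BERT checkpoints to KerasNLP."
    )
    parser.add_argument(
        "--preset",
        default="bert_base_en_uncased",
        help="The model variant to convert.",
    )
    args = parser.parse_args()
    download_model(args.preset)
    convert_checkpoints(args.preset)
```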
I will take it
@vulkomilev, please go ahead with writing the conversion script for BERT! You can follow the same template as RoBERTa's script: https://github.com/keras-team/keras-nlp/blob/master/tools/checkpoint_conversion/convert_roberta_checkpoints.py.
okay
Where can I find BertBase (keras_nlp.models.BertBase)?
@vulkomilev, KerasNLP does not have a separate class for BertBase. There is a BertBackbone model class: https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/models/bert/bert_backbone.py#L35. If you want the base variant of BERT, you can do this:
```python
# Without loading the pretrained weights.
bert_base = keras_nlp.models.BertBackbone.from_preset(
    "bert_base_en_uncased", load_weights=False
)

# Loading the model with the pretrained weights.
bert_base = keras_nlp.models.BertBackbone.from_preset(
    "bert_base_en_uncased", load_weights=True
)
```
These "presets" are drawn from here: https://github.com/keras-team/keras-nlp/blob/master/keras_nlp/models/bert/bert_presets.py.
Regarding checkpoint conversion for BERT, follow the same format as RoBERTa. Use the conversion notebooks mentioned in this directory as reference: https://github.com/keras-team/keras-nlp/tree/master/tools/checkpoint_conversion.
So, for example, the contents of this cell:
```python
# Model Garden BERT paths.
zip_path = f"https://storage.googleapis.com/tf_model_garden/nlp/bert/v3/{TOKEN_TYPE}_L-12_H-768_A-12.tar.gz"
zip_file = keras.utils.get_file(
    f"/content/{MODEL_NAME}",
    zip_path,
    extract=True,
    archive_format="tar",
)
```
can go in the download_model() function.
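That is, roughly the following (a sketch; the URL pattern comes from the cell above, but the preset-to-tarball mapping is a hypothetical placeholder):

```python
import keras

# Hypothetical mapping from preset name to Model Garden tarball name.
PRESET_TO_GARDEN_NAME = {
    "bert_base_en_uncased": "uncased_L-12_H-768_A-12",
    "bert_base_en": "cased_L-12_H-768_A-12",
}


def download_model(preset):
    garden_name = PRESET_TO_GARDEN_NAME[preset]
    zip_path = f"https://storage.googleapis.com/tf_model_garden/nlp/bert/v3/{garden_name}.tar.gz"
    # Download and extract the tarball, returning the local path.
    return keras.utils.get_file(
        preset,
        zip_path,
        extract=True,
        archive_format="tar",
    )
```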
Contents of this cell:
model.get_layer("token_embedding").embeddings.assign(
weights["encoder/layer_with_weights-0/embeddings/.ATTRIBUTES/VARIABLE_VALUE"]
)
model.get_layer("position_embedding").position_embeddings.assign(
weights["encoder/layer_with_weights-1/embeddings/.ATTRIBUTES/VARIABLE_VALUE"]
)
model.get_layer("segment_embedding").embeddings.assign(
weights["encoder/layer_with_weights-2/embeddings/.ATTRIBUTES/VARIABLE_VALUE"]
)
model.get_layer("embeddings_layer_norm").gamma.assign(
weights["encoder/layer_with_weights-3/gamma/.ATTRIBUTES/VARIABLE_VALUE"]
)
model.get_layer("embeddings_layer_norm").beta.assign(
weights["encoder/layer_with_weights-3/beta/.ATTRIBUTES/VARIABLE_VALUE"]
)
for i in range(model.num_layers):
model.get_layer(f"transformer_layer_{i}")._self_attention_layer._key_dense.kernel.assign(
weights[f"encoder/layer_with_weights-{i + 4}/_attention_layer/_key_dense/kernel/.ATTRIBUTES/VARIABLE_VALUE"]
)
model.get_layer(f"transformer_layer_{i}")._self_attention_layer._key_dense.bias.assign(
weights[f"encoder/layer_with_weights-{i + 4}/_attention_layer/_key_dense/bias/.ATTRIBUTES/VARIABLE_VALUE"]
)
model.get_layer(f"transformer_layer_{i}")._self_attention_layer._query_dense.kernel.assign(
weights[f"encoder/layer_with_weights-{i + 4}/_attention_layer/_query_dense/kernel/.ATTRIBUTES/VARIABLE_VALUE"]
)
model.get_layer(f"transformer_layer_{i}")._self_attention_layer._query_dense.bias.assign(
weights[f"encoder/layer_with_weights-{i + 4}/_attention_layer/_query_dense/bias/.ATTRIBUTES/VARIABLE_VALUE"]
)
...
can go in convert_checkpoints().
etc., etc.
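Putting it together, convert_checkpoints() could be shaped like this (a sketch; the weights dict is built with TF's checkpoint reader, mirroring the notebooks, and the per-preset assignment details are elided):

```python
import tensorflow as tf

import keras_nlp


def convert_checkpoints(preset, checkpoint_path):
    # Read every variable in the original TF checkpoint into a dict.
    reader = tf.train.load_checkpoint(checkpoint_path)
    weights = {
        name: reader.get_tensor(name)
        for name, _ in tf.train.list_variables(checkpoint_path)
    }
    # Build the KerasNLP model with randomly initialized weights.
    model = keras_nlp.models.BertBackbone.from_preset(preset, load_weights=False)
    # Assign the embedding, layer norm, and per-layer weights exactly as in
    # the cells above.
    ...
    return model
```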
The conversion script should work for all BERT presets (passed as an arg to the script).
Hey, @vulkomilev! Are you working on this?
Yes, this week I will provide the code.
Hi, I have uploaded the code at https://github.com/vulkomilev/keras-nlp/blob/master/tools/checkpoint_conversion/convert_bert.py, but it needs more work. For example, I can't find the correct keys in the 'weights' array, and I was wondering if the giant "if" in convert_checkpoints can be optimized.
@abheesht17, I forgot to tag you; please check the above message. @mattdangerw
Hey, @vulkomilev! Taking a look, will get back to you ASAP.
@vulkomilev, let's reason through this together! :)
First, there are two sources of BERT checkpoints: TF Model Garden and the official BERT repository. Let's make a table of which preset is obtained from which source.
| Preset | Source | Notebook URL |
|---|---|---|
| bert_tiny_en_uncased | BERT Official Repo | https://github.com/keras-team/keras-nlp/blob/master/tools/checkpoint_conversion/bert_tiny_uncased_en.ipynb |
| bert_small_en_uncased | BERT Official Repo | https://github.com/keras-team/keras-nlp/blob/master/tools/checkpoint_conversion/bert_small_uncased_en.ipynb |
| bert_medium_en_uncased | BERT Official Repo | https://github.com/keras-team/keras-nlp/blob/master/tools/checkpoint_conversion/bert_medium_uncased_en.ipynb |
| bert_base_en_uncased | TF Model Garden | https://github.com/keras-team/keras-nlp/blob/master/tools/checkpoint_conversion/bert_base_uncased.ipynb |
| bert_base_en | TF Model Garden | https://github.com/keras-team/keras-nlp/blob/master/tools/checkpoint_conversion/bert_base_cased.ipynb |
| bert_base_zh | TF Model Garden | https://github.com/keras-team/keras-nlp/blob/master/tools/checkpoint_conversion/bert_base_zh.ipynb |
| bert_base_multi | TF Model Garden | https://github.com/keras-team/keras-nlp/blob/master/tools/checkpoint_conversion/bert_base_multi_cased.ipynb |
| bert_large_en_uncased | BERT Official Repo | https://github.com/keras-team/keras-nlp/blob/master/tools/checkpoint_conversion/bert_large_uncased_en.ipynb |
| bert_large_en | BERT Official Repo | https://github.com/keras-team/keras-nlp/blob/master/tools/checkpoint_conversion/bert_large_cased_en.ipynb |
Now that we have the table ready, let's get to work. The expectation is that the conversion snippet (the giant if...elif...else blocks you are talking about) can be simplified based on the source; i.e., ideally, it should be the same for all presets derived from "BERT Official Repo". Likewise for "TF Model Garden". Let's test this theory!
Let's run diffchecker between bert_tiny_en_uncased and bert_small_en_uncased: https://www.diffchecker.com/yDWyvec0/. The files are identical, and both are derived from "BERT Official Repo".
Let's test any two presets derived from "TF Model Garden", i.e., the difference between bert_base_en_uncased and bert_base_en: https://www.diffchecker.com/Mi4aI5Is/. Ah, shoot! There is a minor difference of two lines, but this isn't a major worry.
So, the conclusion is that you can have an outer if...else to decide between BERT Official Repo and TF Model Garden. Inside this outer if...else block, you can have smaller if...else blocks for the 1-2 lines of variation between conversion scripts (if these differences exist, of course). Hope this clears things up!
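Concretely, the control flow might look like this (a sketch; the groupings come straight from the table above):

```python
# Presets grouped by checkpoint source, per the table above.
BERT_OFFICIAL_REPO_PRESETS = [
    "bert_tiny_en_uncased",
    "bert_small_en_uncased",
    "bert_medium_en_uncased",
    "bert_large_en_uncased",
    "bert_large_en",
]
TF_MODEL_GARDEN_PRESETS = [
    "bert_base_en_uncased",
    "bert_base_en",
    "bert_base_zh",
    "bert_base_multi",
]

if preset in BERT_OFFICIAL_REPO_PRESETS:
    # One shared conversion path for all official-repo presets.
    ...
else:
    # One shared conversion path for all Model Garden presets, with a small
    # inner if...else for the 1-2 line differences between presets.
    ...
```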
Okay, thanks for the information. I am on it.
Made a new commit. Can you please check it out, @abheesht17?
bump @abheesht17
@vulkomilev, could you please open a PR? That way, everyone can take a look and leave comments on your code. Thanks!
will do
done
bump @abheesht17
Hey, @vulkomilev! I left a few comments on your PR a couple of days ago.
Oh, I didn't notice. Sorry!
Hey @vulkomilev, are you still working on this issue?
Yes, I need to connect with the other members.
This issue is stale because it has been open for 180 days with no activity. It will be closed if no further activity occurs. Thank you.