
Can I use this model as a layer of a larger model?

Open · Benjamim-EP opened this issue 2 years ago · 3 comments

I would like to know how I can use this model as in the example below:

```python
import tensorflow as tf
import tensorflow_hub as hub
from tensorflow.keras import layers


class DCNNBERTEmbedding(tf.keras.Model):

    def __init__(self,
                 nb_filters=50,
                 FFN_units=512,
                 nb_classes=2,
                 dropout_rate=0.1,
                 name="dcnn"):
        super(DCNNBERTEmbedding, self).__init__(name=name)

        # BERT embedding layer (frozen)
        self.bert_layer = hub.KerasLayer(
            "https://tfhub.dev/tensorflow/bert_en_uncased_L-12_H-768_A-12/1",
            name="bert",
            trainable=False)

        self.bigram = layers.Conv1D(filters=nb_filters,
                                    kernel_size=2,
                                    padding="valid",
                                    activation="relu")
        self.trigram = layers.Conv1D(filters=nb_filters,
                                     kernel_size=3,
                                     padding="valid",
                                     activation="relu")
        self.fourgram = layers.Conv1D(filters=nb_filters,
                                      kernel_size=4,
                                      padding="valid",
                                      activation="relu")
        self.pool = layers.GlobalMaxPool1D()
        self.dense_1 = layers.Dense(units=FFN_units, activation="relu")
        self.dropout = layers.Dropout(rate=dropout_rate)
        if nb_classes == 2:
            self.last_dense = layers.Dense(units=1,
                                           activation="sigmoid")
        else:
            self.last_dense = layers.Dense(units=nb_classes,
                                           activation="softmax")

    # Embed the tokens with BERT
    def embed_with_bert(self, all_tokens):
        # bert_layer returns two outputs: the first is the pooled vector for
        # the whole sentence, the second holds the per-token embeddings, so
        # we keep only the second one.
        _, embs = self.bert_layer([all_tokens[:, 0, :],   # input ids
                                   all_tokens[:, 1, :],   # input mask
                                   all_tokens[:, 2, :]])  # segment ids
        return embs

    # Forward pass: embed with BERT, apply the convolutions, then classify
    def call(self, inputs, training):
        x = self.embed_with_bert(inputs)

        x_1 = self.bigram(x)
        x_1 = self.pool(x_1)
        x_2 = self.trigram(x)
        x_2 = self.pool(x_2)
        x_3 = self.fourgram(x)
        x_3 = self.pool(x_3)

        merged = tf.concat([x_1, x_2, x_3], axis=-1)  # (batch_size, 3 * nb_filters)
        merged = self.dense_1(merged)
        merged = self.dropout(merged, training=training)
        output = self.last_dense(merged)

        return output
```
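For context, this model expects its three BERT inputs (ids, mask, segment ids) stacked along the second axis of a single tensor. A minimal smoke test with dummy data (the batch size and the sequence length of 128 are arbitrary assumptions, not values from the model above):

```python
import numpy as np

# Dummy batch: 4 examples, with ids / mask / segment ids stacked on axis 1,
# each of (assumed) length 128. Zeros are enough to check shapes.
dummy_tokens = np.zeros((4, 3, 128), dtype=np.int32)

model = DCNNBERTEmbedding(nb_classes=2)
out = model(dummy_tokens, training=False)
print(out.shape)  # (4, 1): one sigmoid output per example for the binary head
```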

— Benjamim-EP, Jul 20 '21

Hi @Benjamim-EP,

I am not a TensorFlow user, so unfortunately I can't give you directions. But it should be possible to adapt a working example for English BERT (or another language) using the BERTimbau TensorFlow checkpoint (weights) and config file. I'll leave this issue open so others may help you. Please share your experience with us if you find a solution :)
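One possible route (a sketch I haven't run end to end) is to first convert the released TensorFlow checkpoint into a Hugging Face model directory on the PyTorch side; the file paths below are placeholders for the released BERTimbau files:

```python
from transformers import BertConfig, BertModel, load_tf_weights_in_bert

# Build a PyTorch BertModel from the released config, load the TensorFlow
# checkpoint weights into it, and save it as a Hugging Face model directory.
config = BertConfig.from_json_file('path/to/bert_config.json')
model = BertModel(config)
load_tf_weights_in_bert(model, config, 'path/to/bert_model.ckpt')
model.save_pretrained('path/to/bert_dir/')
```

The resulting directory can then be loaded from TensorFlow with `TFBertModel.from_pretrained(..., from_pt=True)`, as shown in the next comment.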

— fabiocapsouza, Jul 22 '21

Hi @Benjamim-EP and @fabiocapsouza, using this model as part of another model should be straightforward. Here is a minimal example that uses BERT as the base and adds a classifier head on top:

```python
import tensorflow as tf
from transformers import TFBertModel

n_classes = 2  # set to the number of target classes

# Load BERT with the HF API
encoder = TFBertModel.from_pretrained('path/to/bert_dir/', from_pt=True)

# Build a model composed with BERT
# Input layers (matching the tokenizer outputs)
input_ids = tf.keras.layers.Input(shape=(None,), dtype=tf.int32, name='input_ids')
token_type_ids = tf.keras.layers.Input(shape=(None,), dtype=tf.int32, name='token_type_ids')
attention_mask = tf.keras.layers.Input(shape=(None,), dtype=tf.int32, name='attention_mask')

# BERT encoder
encoded = encoder({"input_ids": input_ids,
                   "token_type_ids": token_type_ids,
                   "attention_mask": attention_mask})['pooler_output']
# Classifier head
outputs = tf.keras.layers.Dense(n_classes, activation='softmax', name='classifier')(encoded)

# Build the model
model = tf.keras.models.Model(inputs=[input_ids, token_type_ids, attention_mask], outputs=outputs)
```

The only issue I faced is that TFBertModel is not able to load the TensorFlow checkpoint files directly, so you need to load the PyTorch weights and pass `from_pt=True`.
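To complete the picture, a hedged sketch of running the composed model above; I'm assuming the BERTimbau tokenizer published on the Hugging Face hub (`neuralmind/bert-base-portuguese-cased`) and arbitrary example sentences:

```python
from transformers import BertTokenizer

# Tokenize a small batch; padding=True pads to the longest sequence and
# return_tensors='tf' yields the tf.Tensor inputs the model expects.
tokenizer = BertTokenizer.from_pretrained('neuralmind/bert-base-portuguese-cased')
batch = tokenizer(['um exemplo de frase', 'outro exemplo'],
                  padding=True, return_tensors='tf')

# The Keras model matches dict keys to the Input layer names defined above.
probs = model({'input_ids': batch['input_ids'],
               'token_type_ids': batch['token_type_ids'],
               'attention_mask': batch['attention_mask']})
print(probs.shape)  # (2, n_classes)
```

From here the model can be trained with the usual `model.compile` / `model.fit` calls.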

— dimitreOliveira, Feb 21 '22

I believe this example should be in the README to show how to use the model with TensorFlow. ;)

— jvanz, Mar 09 '22