portuguese-bert
Can I use this model as a layer of a larger model?
I would like to know how I can use this model in place of the English BERT layer in the example below:
`
import tensorflow as tf
import tensorflow_hub as hub
from tensorflow.keras import layers


class DCNNBERTEmbedding(tf.keras.Model):

    def __init__(self,
                 nb_filters=50,
                 FFN_units=512,
                 nb_classes=2,
                 dropout_rate=0.1,
                 name="dcnn"):
        super(DCNNBERTEmbedding, self).__init__(name=name)

        # BERT embedding layer (frozen)
        self.bert_layer = hub.KerasLayer(
            "https://tfhub.dev/tensorflow/bert_en_uncased_L-12_H-768_A-12/1",
            name="bert",
            trainable=False)

        # Convolutions over windows of 2, 3 and 4 tokens
        self.bigram = layers.Conv1D(filters=nb_filters,
                                    kernel_size=2,
                                    padding="valid",
                                    activation="relu")
        self.trigram = layers.Conv1D(filters=nb_filters,
                                     kernel_size=3,
                                     padding="valid",
                                     activation="relu")
        self.fourgram = layers.Conv1D(filters=nb_filters,
                                      kernel_size=4,
                                      padding="valid",
                                      activation="relu")
        self.pool = layers.GlobalMaxPool1D()
        self.dense_1 = layers.Dense(units=FFN_units, activation="relu")
        self.dropout = layers.Dropout(rate=dropout_rate)
        if nb_classes == 2:
            self.last_dense = layers.Dense(units=1,
                                           activation="sigmoid")
        else:
            self.last_dense = layers.Dense(units=nb_classes,
                                           activation="softmax")

    # Embed the tokens with BERT
    def embed_with_bert(self, all_tokens):
        # bert_layer returns two values: the first relates to the whole
        # sentence, the second holds the per-token embeddings, so we only
        # keep the second one.
        _, embs = self.bert_layer([all_tokens[:, 0, :],   # [:, 0, :] = token ids
                                   all_tokens[:, 1, :],   # [:, 1, :] = mask
                                   all_tokens[:, 2, :]])  # [:, 2, :] = segment ids
        return embs

    # Forward pass: fetch the BERT embeddings, then apply the CNN head
    def call(self, inputs, training=None):
        x = self.embed_with_bert(inputs)
        x_1 = self.bigram(x)
        x_1 = self.pool(x_1)
        x_2 = self.trigram(x)
        x_2 = self.pool(x_2)
        x_3 = self.fourgram(x)
        x_3 = self.pool(x_3)
        merged = tf.concat([x_1, x_2, x_3], axis=-1)  # (batch_size, 3 * nb_filters)
        merged = self.dense_1(merged)
        merged = self.dropout(merged, training=training)
        output = self.last_dense(merged)
        return output
`
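For anyone adapting the model above, it may help to trace the tensor shapes through the convolutional head. This is a rough, framework-free sketch. The helper name `dcnn_shapes` and the batch size and sequence length in the example call are assumptions chosen for illustration. The hidden size of 768 comes from the `H-768` in the hub URL.

```python
def dcnn_shapes(batch_size, seq_len, hidden_size, nb_filters=50, kernel_sizes=(2, 3, 4)):
    """Trace tensor shapes through the n-gram convolution head (illustrative helper)."""
    if seq_len < max(kernel_sizes):
        # With padding="valid" the sequence must cover the largest window.
        raise ValueError("seq_len must be at least the largest kernel size")

    shapes = {"embeddings": (batch_size, seq_len, hidden_size)}
    for k in kernel_sizes:
        # A stride-1 Conv1D with padding="valid" yields seq_len - k + 1 positions.
        shapes[f"conv_k{k}"] = (batch_size, seq_len - k + 1, nb_filters)
        # Global max pooling removes the sequence axis.
        shapes[f"pool_k{k}"] = (batch_size, nb_filters)
    # Concatenating the pooled features on the last axis gives 3 * nb_filters.
    shapes["merged"] = (batch_size, nb_filters * len(kernel_sizes))
    return shapes


shapes = dcnn_shapes(batch_size=32, seq_len=128, hidden_size=768)
print(shapes["conv_k4"])  # (32, 125, 50)
print(shapes["merged"])   # (32, 150)
```

The point is that the head needs a per-token output of shape `(batch, seq_len, hidden)`. A single pooled sentence vector would not work here.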
Hi @Benjamim-EP ,
I am not a TensorFlow user, so unfortunately I can't give you directions. But it should be possible to adapt a working example for English BERT (or another language) by using the BERTimbau TensorFlow checkpoint (weights) and config file. I'll leave this issue open so others may help you. Please share your experience with us if you find a solution :)
Hi @Benjamim-EP and @fabiocapsouza, using this model as part of another model should be straightforward. Here is a minimal example that uses BERT as the base and adds a classifier head on top:
import tensorflow as tf
from transformers import TFBertModel

# Load BERT with the HF API (from_pt=True converts the PyTorch weights)
encoder = TFBertModel.from_pretrained('path/to/bert_dir/', from_pt=True)

# Build model composed with BERT
# Input layers (from the tokenizer)
input_ids = tf.keras.layers.Input(shape=(None,), dtype=tf.int32, name='input_ids')
token_type_ids = tf.keras.layers.Input(shape=(None,), dtype=tf.int32, name='token_type_ids')
attention_mask = tf.keras.layers.Input(shape=(None,), dtype=tf.int32, name='attention_mask')

# BERT encoder (pooler_output is one vector per input sequence)
encoded = encoder({"input_ids": input_ids,
                   "token_type_ids": token_type_ids,
                   "attention_mask": attention_mask})['pooler_output']

# Classifier head (n_classes, the number of target classes, must be defined beforehand)
outputs = tf.keras.layers.Dense(n_classes, activation='softmax', name='classifier')(encoded)

# Build the model
model = tf.keras.models.Model(inputs=[input_ids, token_type_ids, attention_mask], outputs=outputs)
The only issue I faced is that `TFBertModel` is not able to load the files from the TensorFlow checkpoints, so you need to load from the PyTorch weights and use `from_pt=True`.
I believe this should be in the README as an example of how to use the model with TensorFlow. ;)
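To connect this back to the original question: the convolutional head works on per-token embeddings, so a sketch along these lines would read `last_hidden_state` instead of `pooler_output`. The class name `DCNNOnEncoder` and its `encoder` constructor argument are hypothetical. The sketch also assumes a TensorFlow 2.x setup in which the Transformers TF models work with `tf.keras`, and it has not been run against the BERTimbau weights.

```python
import tensorflow as tf


class DCNNOnEncoder(tf.keras.Model):
    """Hypothetical sketch: n-gram CNN head on top of a BERT-style encoder."""

    def __init__(self, encoder, nb_filters=50, ffn_units=512,
                 nb_classes=2, dropout_rate=0.1, name="dcnn"):
        super().__init__(name=name)
        # Assumption: encoder is something like
        # TFBertModel.from_pretrained('path/to/bert_dir/', from_pt=True)
        self.encoder = encoder
        # One convolution per window size (2, 3 and 4 tokens).
        self.convs = [
            tf.keras.layers.Conv1D(nb_filters, k, padding="valid", activation="relu")
            for k in (2, 3, 4)
        ]
        self.pool = tf.keras.layers.GlobalMaxPool1D()
        self.dense_1 = tf.keras.layers.Dense(ffn_units, activation="relu")
        self.dropout = tf.keras.layers.Dropout(dropout_rate)
        if nb_classes == 2:
            self.last_dense = tf.keras.layers.Dense(1, activation="sigmoid")
        else:
            self.last_dense = tf.keras.layers.Dense(nb_classes, activation="softmax")

    def call(self, inputs, training=None):
        # inputs: dict with input_ids, token_type_ids and attention_mask,
        # each of shape (batch, seq_len), as produced by the tokenizer.
        encoded = self.encoder({"input_ids": inputs["input_ids"],
                                "token_type_ids": inputs["token_type_ids"],
                                "attention_mask": inputs["attention_mask"]},
                               training=training)
        x = encoded["last_hidden_state"]  # (batch, seq_len, hidden)
        pooled = [self.pool(conv(x)) for conv in self.convs]
        merged = tf.concat(pooled, axis=-1)  # (batch, 3 * nb_filters)
        merged = self.dense_1(merged)
        merged = self.dropout(merged, training=training)
        return self.last_dense(merged)
```

The model would then be built as `DCNNOnEncoder(encoder)` and called on the tokenizer output (with `return_tensors='tf'`). To mirror the `trainable=False` setting from the question, setting `encoder.trainable = False` before compiling should keep the BERT weights frozen.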