How to get T5 decoded logits using TFT5ForConditionalGeneration from encoded outputs?
System Info
- `transformers` version: 4.24.0
- Platform: Linux-6.1.11-76060111-generic-x86_64-with-glibc2.35
- Python version: 3.10.9
- Huggingface_hub version: 0.10.1
- PyTorch version (GPU?): 1.12.1 (False)
- Tensorflow version (GPU?): 2.10.0 (False)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Using GPU in script?:
- Using distributed or parallel set-up in script?:
Who can help?
@Rocketknight1 @gante
Information
- [ ] The official example scripts
- [ ] My own modified scripts
Tasks
- [ ] An officially supported task in the `examples` folder (such as GLUE/SQuAD, ...)
- [X] My own task or dataset (give details below)
Reproduction
import numpy as np
import tensorflow as tf
from transformers import AutoTokenizer, T5Config, TFT5ForConditionalGeneration
distill_config = T5Config(d_model=256, d_kv = 32, d_ff=512, num_heads=4, decoder_start_token_id=0)
tf_model = TFT5ForConditionalGeneration(config=distill_config)
tokenizer = AutoTokenizer.from_pretrained("t5-small", padding='max_length', truncation=True)
inputs = tokenizer("this is a random input", return_tensors="tf")['input_ids']
encoder_outputs = tf_model.encoder(inputs)
decoder_input_ids = tf.convert_to_tensor(np.asarray([[0]]).astype(np.int32))
output = tf_model.decoder(decoder_input_ids = decoder_input_ids, encoder_outputs=encoder_outputs.last_hidden_state)
Error:
ValueError Traceback (most recent call last)
<ipython-input-5-face8f4fd36f> in <module>
10 encoder_outputs = tf_model.encoder(inputs)
11 decoder_input_ids = tf.convert_to_tensor(np.asarray([[0]]).astype(np.int32))
---> 12 output = tf_model.decoder(decoder_input_ids = decoder_input_ids, encoder_outputs=encoder_outputs.last_hidden_state)
1 frames
/usr/local/lib/python3.9/dist-packages/keras/utils/traceback_utils.py in error_handler(*args, **kwargs)
68 # To get the full stack trace, call:
69 # `tf.debugging.disable_traceback_filtering()`
---> 70 raise e.with_traceback(filtered_tb) from None
71 finally:
72 del filtered_tb
/usr/local/lib/python3.9/dist-packages/keras/utils/layer_utils.py in split_out_first_arg(self, args, kwargs)
807 inputs = kwargs.pop(self._arg_names[0])
808 else:
--> 809 raise ValueError(
810 "The first argument to `Layer.call` must always be passed."
811 )
ValueError: The first argument to `Layer.call` must always be passed.
Expected behavior
I am trying to convert a TFT5ForConditionalGeneration with a custom config into a TFLite model, and as far as I can tell, implementing a greedy decoding loop myself seems to be the faster route, but if you know of a more straightforward process, please let me know.
I am currently trying to produce the decoder output from the encoder output, which I want to compute only once, when I pass in the full sentence. I would then reuse this encoded vector as input to the decoder for the remaining steps of the greedy search.
Without reusing a precomputed encoder output, the following gives me the expected result:
import tensorflow as tf
from transformers import AutoTokenizer, T5Config, TFT5ForConditionalGeneration, set_seed
set_seed(0)
tokenizer = AutoTokenizer.from_pretrained("t5-small", padding='max_length', truncation=True)
tf_model = TFT5ForConditionalGeneration.from_pretrained("t5-small")
inputs = tokenizer("i got permission to begin a start up company by my own..</s>",return_tensors='tf')
attn = inputs['attention_mask']
decoder_input = tf.zeros((1,1), dtype=tf.int64)
output = tf_model(input_ids=inputs['input_ids'], attention_mask = attn, decoder_input_ids=decoder_input).logits
print(tokenizer.batch_decode(output.numpy().argmax(-1).tolist()), output.numpy().argmax(-1).tolist())
Output:
[''] [[3]]
But I get a different answer when I try to use the precomputed encoder output, as below.
import tensorflow as tf
from transformers import AutoTokenizer, T5Config, TFT5ForConditionalGeneration, set_seed
set_seed(0)
tokenizer = AutoTokenizer.from_pretrained("t5-small", padding='max_length', truncation=True)
tf_model = TFT5ForConditionalGeneration.from_pretrained("t5-small")
inputs = tokenizer("i got permission to begin a start up company by my own..</s>",return_tensors='tf')
attn = inputs['attention_mask']
encoder_outputs = tf_model.encoder(inputs['input_ids'], attention_mask = attn, return_dict = True)
decoder_input = tf.zeros((1,1), dtype=tf.int64)
output = tf_model.decoder(decoder_input, encoder_hidden_states=encoder_outputs.last_hidden_state).last_hidden_state
print(tokenizer.batch_decode(output.numpy().argmax(-1).tolist()), output.numpy().argmax(-1).tolist())
Output:
['une'] [[245]]
Hi @FrozenWolf-Cyber, thanks for raising this issue.
This difference arises because the two scripts are not equivalent. In the forward pass of the T5 model, the output of the decoder is passed to the language model head to produce the final logits - see the relevant lines here.
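For reference, that step can be sketched roughly as follows (a paraphrase of the logic in `modeling_tf_t5.py`, wrapped in a hypothetical helper of my own naming, not the exact library code):

```python
import tensorflow as tf

def lm_logits_from_decoder_output(model, sequence_output):
    # Hypothetical helper (not a library API) mirroring how TFT5ForConditionalGeneration
    # turns decoder hidden states into vocabulary logits in its forward pass.
    if model.config.tie_word_embeddings:
        # With tied embeddings, the shared embedding matrix doubles as the output projection.
        sequence_output = sequence_output * (model.model_dim ** -0.5)
        return tf.matmul(sequence_output, model.shared.weights, transpose_b=True)
    # Otherwise a separate lm_head layer exists and is applied.
    return model.lm_head(sequence_output)
```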
@amyeroberts Thanks for replying.
I tried to do:
tf_model.lm_head(output[0])
But I seem to be getting the following error:
AttributeError Traceback (most recent call last)
<ipython-input-13-8324bea7f5ea> in <module>
----> 1 tf_model.lm_head(output[0])
AttributeError: 'TFT5ForConditionalGeneration' object has no attribute 'lm_head'
This is because, for the "t5-small" checkpoint config, `tie_word_embeddings == True`. In this case, there isn't an `lm_head` layer; instead, the shared embedding weights are used as the output projection. The relevant lines are here.
import tensorflow as tf
from transformers import AutoTokenizer, T5Config, TFT5ForConditionalGeneration, set_seed
set_seed(0)
tokenizer = AutoTokenizer.from_pretrained("t5-small", padding='max_length', truncation=True)
tf_model = TFT5ForConditionalGeneration.from_pretrained("t5-small")
inputs = tokenizer("i got permission to begin a start up company by my own..</s>",return_tensors='tf')
attn = inputs['attention_mask']
encoder_outputs = tf_model.encoder(inputs['input_ids'], attention_mask = attn)
decoder_input = tf.zeros((1,1), dtype=tf.int64)
sequence_output = tf_model.decoder(decoder_input, encoder_hidden_states=encoder_outputs[0])[0]
sequence_output = sequence_output * (tf_model.model_dim**-0.5)
logits = tf.matmul(sequence_output, tf_model.shared.weights, transpose_b=True)
print(tokenizer.batch_decode(logits.numpy().argmax(-1).tolist()))
@amyeroberts Thank you very much, this code works now :)
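For anyone who lands here later: a minimal greedy-decoding loop built on top of the snippet above might look like the sketch below. This is my own rough code, not verified against `model.generate()`; it re-runs the full decoder at every step instead of using past key values, and names such as `max_new_tokens` are just illustrative.

```python
import tensorflow as tf
from transformers import AutoTokenizer, TFT5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("t5-small")
tf_model = TFT5ForConditionalGeneration.from_pretrained("t5-small")

inputs = tokenizer("translate English to German: I love programming.", return_tensors="tf")
attn = inputs["attention_mask"]

# Run the encoder once and reuse its output for every decoding step.
encoder_outputs = tf_model.encoder(inputs["input_ids"], attention_mask=attn)
encoder_hidden_states = encoder_outputs[0]

# Start from the decoder start token (0 for T5).
decoder_input_ids = tf.constant([[tf_model.config.decoder_start_token_id]], dtype=tf.int32)

max_new_tokens = 20  # illustrative limit
for _ in range(max_new_tokens):
    # Re-run the decoder on everything generated so far (no past-key-value caching here).
    sequence_output = tf_model.decoder(
        decoder_input_ids,
        encoder_hidden_states=encoder_hidden_states,
        encoder_attention_mask=attn,
    )[0]
    # Tied-embedding output projection, exactly as in the snippet above.
    sequence_output = sequence_output * (tf_model.model_dim ** -0.5)
    logits = tf.matmul(sequence_output, tf_model.shared.weights, transpose_b=True)
    # Greedy step: take the argmax at the last position and append it.
    next_token = tf.argmax(logits[:, -1, :], axis=-1, output_type=tf.int32)[:, None]
    decoder_input_ids = tf.concat([decoder_input_ids, next_token], axis=-1)
    if int(next_token[0, 0]) == tf_model.config.eos_token_id:
        break

print(tokenizer.batch_decode(decoder_input_ids.numpy().tolist(), skip_special_tokens=True))
```

If I remember the TF T5 call signature correctly, the full model also accepts an `encoder_outputs` argument (it is what `generate()` uses to avoid re-running the encoder), so passing the cached encoder output straight into `tf_model(...)` should give the same logits without reimplementing the projection; I have not checked how well either variant traces for TFLite export.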