How to get T5 decoded logits using TFT5ForConditionalGeneration from encoded outputs?
System Info
- `transformers` version: 4.24.0
- Platform: Linux-6.1.11-76060111-generic-x86_64-with-glibc2.35
- Python version: 3.10.9
- Huggingface_hub version: 0.10.1
- PyTorch version (GPU?): 1.12.1 (False)
- Tensorflow version (GPU?): 2.10.0 (False)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Using GPU in script?:
- Using distributed or parallel set-up in script?:
Who can help?
@Rocketknight1 @gante
Information
- [ ] The official example scripts
- [ ] My own modified scripts
Tasks
- [ ] An officially supported task in the `examples` folder (such as GLUE/SQuAD, ...)
- [X] My own task or dataset (give details below)
Reproduction
import numpy as np
import tensorflow as tf
from transformers import AutoTokenizer, T5Config, TFT5ForConditionalGeneration
distill_config = T5Config(d_model=256, d_kv = 32, d_ff=512, num_heads=4, decoder_start_token_id=0)
tf_model = TFT5ForConditionalGeneration(config=distill_config)
tokenizer = AutoTokenizer.from_pretrained("t5-small", padding='max_length', truncation=True)
inputs = tokenizer("this is a random input", return_tensors="tf")['input_ids']
encoder_outputs = tf_model.encoder(inputs)
decoder_input_ids = tf.convert_to_tensor(np.asarray([[0]]).astype(np.int32))
output = tf_model.decoder(decoder_input_ids = decoder_input_ids, encoder_outputs=encoder_outputs.last_hidden_state)
Error:
ValueError Traceback (most recent call last)
<ipython-input-5-face8f4fd36f> in <module>
10 encoder_outputs = tf_model.encoder(inputs)
11 decoder_input_ids = tf.convert_to_tensor(np.asarray([[0]]).astype(np.int32))
---> 12 output = tf_model.decoder(decoder_input_ids = decoder_input_ids, encoder_outputs=encoder_outputs.last_hidden_state)
1 frames
/usr/local/lib/python3.9/dist-packages/keras/utils/traceback_utils.py in error_handler(*args, **kwargs)
68 # To get the full stack trace, call:
69 # `tf.debugging.disable_traceback_filtering()`
---> 70 raise e.with_traceback(filtered_tb) from None
71 finally:
72 del filtered_tb
/usr/local/lib/python3.9/dist-packages/keras/utils/layer_utils.py in split_out_first_arg(self, args, kwargs)
807 inputs = kwargs.pop(self._arg_names[0])
808 else:
--> 809 raise ValueError(
810 "The first argument to `Layer.call` must always be passed."
811 )
ValueError: The first argument to `Layer.call` must always be passed.
Expected behavior
I am trying to convert a TFT5ForConditionalGeneration with a custom config into a TFLite model, and as far as I can tell, implementing a greedy decoding loop myself seems to be the faster route, but if you know of a more straightforward process, please let me know.
I am currently trying to produce the decoder output from the encoder output, which I want to compute only once, when I pass in the full sentence. I would then reuse this encoded vector as input to the decoder for the remaining steps of the greedy search.
Without reusing a precomputed encoder output, the following gives me the expected result:
import tensorflow as tf
from transformers import AutoTokenizer, T5Config, TFT5ForConditionalGeneration, set_seed
set_seed(0)
tokenizer = AutoTokenizer.from_pretrained("t5-small", padding='max_length', truncation=True)
tf_model = TFT5ForConditionalGeneration.from_pretrained("t5-small")
inputs = tokenizer("i got permission to begin a start up company by my own..</s>",return_tensors='tf')
attn = inputs['attention_mask']
decoder_input = tf.zeros((1,1), dtype=tf.int64)
output = tf_model(input_ids=inputs['input_ids'], attention_mask = attn, decoder_input_ids=decoder_input).logits
print(tokenizer.batch_decode(output.numpy().argmax(-1).tolist()), output.numpy().argmax(-1).tolist())
Output:
[''] [[3]]
But I get a different answer when I try to use the precomputed encoder output, as below.
import tensorflow as tf
from transformers import AutoTokenizer, T5Config, TFT5ForConditionalGeneration, set_seed
set_seed(0)
tokenizer = AutoTokenizer.from_pretrained("t5-small", padding='max_length', truncation=True)
tf_model = TFT5ForConditionalGeneration.from_pretrained("t5-small")
inputs = tokenizer("i got permission to begin a start up company by my own..</s>",return_tensors='tf')
attn = inputs['attention_mask']
encoder_outputs = tf_model.encoder(inputs['input_ids'], attention_mask = attn, return_dict = True)
decoder_input = tf.zeros((1,1), dtype=tf.int64)
output = tf_model.decoder(decoder_input, encoder_hidden_states=encoder_outputs.last_hidden_state).last_hidden_state
print(tokenizer.batch_decode(output.numpy().argmax(-1).tolist()), output.numpy().argmax(-1).tolist())
Output:
['une'] [[245]]
Hi @FrozenWolf-Cyber, thanks for raising this issue.
This difference arises because the two scripts are not equivalent. In the forward pass of the T5 model, the output of the decoder is passed to the language model head to produce the final logits - see the relevant lines here.
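For reference, that step can be sketched roughly as follows (a paraphrase of the logic in `modeling_tf_t5.py`, wrapped in a hypothetical helper of my own naming, not the exact library code):

```python
import tensorflow as tf

def lm_logits_from_decoder_output(model, sequence_output):
    # Hypothetical helper (not a library API) mirroring how TFT5ForConditionalGeneration
    # turns decoder hidden states into vocabulary logits in its forward pass.
    if model.config.tie_word_embeddings:
        # With tied embeddings, the shared embedding matrix doubles as the output projection.
        sequence_output = sequence_output * (model.model_dim ** -0.5)
        return tf.matmul(sequence_output, model.shared.weights, transpose_b=True)
    # Otherwise a separate lm_head layer exists and is applied.
    return model.lm_head(sequence_output)
```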
@amyeroberts Thanks for replying.
I tried to do:
tf_model.lm_head(output[0])
But I seem to be getting the following error:
AttributeError Traceback (most recent call last)
<ipython-input-13-8324bea7f5ea> in <module>
----> 1 tf_model.lm_head(output[0])
AttributeError: 'TFT5ForConditionalGeneration' object has no attribute 'lm_head'
This is because, for the "t5-small" checkpoint config, `tie_word_embeddings == True`. In this case, there isn't an `lm_head` layer; instead, the shared embedding weights are used as the output projection. The relevant lines are here.
import tensorflow as tf
from transformers import AutoTokenizer, T5Config, TFT5ForConditionalGeneration, set_seed
set_seed(0)
tokenizer = AutoTokenizer.from_pretrained("t5-small", padding='max_length', truncation=True)
tf_model = TFT5ForConditionalGeneration.from_pretrained("t5-small")
inputs = tokenizer("i got permission to begin a start up company by my own..</s>",return_tensors='tf')
attn = inputs['attention_mask']
encoder_outputs = tf_model.encoder(inputs['input_ids'], attention_mask = attn)
decoder_input = tf.zeros((1,1), dtype=tf.int64)
sequence_output = tf_model.decoder(decoder_input, encoder_hidden_states=encoder_outputs[0])[0]
sequence_output = sequence_output * (tf_model.model_dim**-0.5)
logits = tf.matmul(sequence_output, tf_model.shared.weights, transpose_b=True)
print(tokenizer.batch_decode(logits.numpy().argmax(-1).tolist()))
@amyeroberts Thank you very much, this code works now :)
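For anyone who lands here later: a minimal greedy-decoding loop built on top of the snippet above might look like the sketch below. This is my own rough code, not verified against `model.generate()`; it re-runs the full decoder at every step instead of using past key values, and names such as `max_new_tokens` are just illustrative.

```python
import tensorflow as tf
from transformers import AutoTokenizer, TFT5ForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("t5-small")
tf_model = TFT5ForConditionalGeneration.from_pretrained("t5-small")

inputs = tokenizer("translate English to German: I love programming.", return_tensors="tf")
attn = inputs["attention_mask"]

# Run the encoder once and reuse its output for every decoding step.
encoder_outputs = tf_model.encoder(inputs["input_ids"], attention_mask=attn)
encoder_hidden_states = encoder_outputs[0]

# Start from the decoder start token (0 for T5).
decoder_input_ids = tf.constant([[tf_model.config.decoder_start_token_id]], dtype=tf.int32)

max_new_tokens = 20  # illustrative limit
for _ in range(max_new_tokens):
    # Re-run the decoder on everything generated so far (no past-key-value caching here).
    sequence_output = tf_model.decoder(
        decoder_input_ids,
        encoder_hidden_states=encoder_hidden_states,
        encoder_attention_mask=attn,
    )[0]
    # Tied-embedding output projection, exactly as in the snippet above.
    sequence_output = sequence_output * (tf_model.model_dim ** -0.5)
    logits = tf.matmul(sequence_output, tf_model.shared.weights, transpose_b=True)
    # Greedy step: take the argmax at the last position and append it.
    next_token = tf.argmax(logits[:, -1, :], axis=-1, output_type=tf.int32)[:, None]
    decoder_input_ids = tf.concat([decoder_input_ids, next_token], axis=-1)
    if int(next_token[0, 0]) == tf_model.config.eos_token_id:
        break

print(tokenizer.batch_decode(decoder_input_ids.numpy().tolist(), skip_special_tokens=True))
```

If I remember the TF T5 call signature correctly, the full model also accepts an `encoder_outputs` argument (it is what `generate()` uses to avoid re-running the encoder), so passing the cached encoder output straight into `tf_model(...)` should give the same logits without reimplementing the projection; I have not checked how well either variant traces for TFLite export.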