transformers
TFEncoderDecoderModel can not be trained with TF Keras fit() method
System Info
- transformers version: 4.21.0
- Platform: Linux-4.15.0-188-generic-x86_64-with-glibc2.31
- Python version: 3.9.12
- Huggingface_hub version: 0.8.1
- PyTorch version (GPU?): not installed (NA)
- Tensorflow version (GPU?): 2.6.2 (True)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Using GPU in script?:
- Using distributed or parallel set-up in script?:
Who can help?
No response
Information
- [x] The official example scripts
- [ ] My own modified scripts
Tasks
- [x] An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- [ ] My own task or dataset (give details below)
Reproduction
Use the example here: https://huggingface.co/docs/transformers/v4.20.1/en/model_doc/encoder-decoder#transformers.TFEncoderDecoderModel.call.example
- Try to fit the model: model.fit(input_ids=input_ids, decoder_input_ids=input_ids)
- You will receive the error "TypeError: fit() got an unexpected keyword argument 'input_ids'"

- you can try this : model.fit(input_ids, input_ids)
- but you receive many errors:

Expected behavior
I should be able to train a TFEncoderDecoderModel with TF Keras fit() method
Hi @kmkarakaya 👋 Having a popular project like transformers means we get many support and feature requests — if we want to maximize how much we help the community, the community has to help us stay productive 🙏
To that end, please share a short script where the issue is clearly reproducible on any computer. Thank you 🤗
Hi @gante,
Here is the script ( https://huggingface.co/docs/transformers/v4.20.1/en/model_doc/encoder-decoder#transformers.TFEncoderDecoderModel.call.example ), which I modified to train the model as below:

    import tensorflow as tf
    from transformers import TFEncoderDecoderModel, BertTokenizer

    model = TFEncoderDecoderModel.from_encoder_decoder_pretrained("bert-base-cased", "gpt2")
    tokenizer = BertTokenizer.from_pretrained("bert-base-cased")

    model.compile(loss=None)
    model.fit(input_ids=input_ids, decoder_input_ids=input_ids, labels=input_ids)

The error message: ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
Hi @kmkarakaya -- technically I can't reproduce the script, since I don't have access to your input_ids.
However, looking at the code, I can tell that model.fit is not being called correctly. Please check its documentation, especially its x and y arguments :)
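To illustrate the point about the x and y arguments, here is a minimal sketch in plain Python. The `fit` function below is a hypothetical stand-in that only mimics the leading parameters of Keras `Model.fit` (`x`, `y`, `batch_size`, `epochs`); it is not the Keras implementation, and the token ids are made up. It shows why passing `input_ids=` as a keyword raises a `TypeError`, while packing the same values into a dict and passing it as `x` is accepted by the signature.

```python
def fit(x=None, y=None, batch_size=None, epochs=1):
    """Hypothetical stand-in with the same leading parameters as Keras Model.fit."""
    # A real fit() would train here; this stub only echoes what it received.
    return {"x": x, "y": y}


# Made-up token ids, used only for illustration.
ids = [[101, 8667, 102]]

# Model inputs passed as keyword arguments are not part of the signature.
try:
    fit(input_ids=ids, decoder_input_ids=ids)
    error_message = None
except TypeError as err:
    error_message = str(err)

# The same inputs packed into one dict and passed as `x` are accepted.
result = fit({"input_ids": ids, "decoder_input_ids": ids, "labels": ids})

print(error_message)
print(sorted(result["x"]))
```

Whether a real model then picks up `labels` from that dict depends on the transformers version, so that part should be checked against the library's training examples.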
@gante As I wrote in every message this code belongs to the HF repo https://huggingface.co/docs/transformers/v4.20.1/en/model_doc/encoder-decoder#transformers.TFEncoderDecoderModel.call.example
Here is the complete code from the HF link. I hope this time you can help fix the problem:

    from transformers import TFEncoderDecoderModel, BertTokenizer

    model = TFEncoderDecoderModel.from_encoder_decoder_pretrained("bert-base-cased", "gpt2")
    tokenizer = BertTokenizer.from_pretrained("bert-base-cased")

    input_ids = tokenizer.encode(
        "Hello, my dog is cute", add_special_tokens=True, return_tensors="tf"
    )

    model.compile(loss=None)
    model.fit(input_ids=input_ids, decoder_input_ids=input_ids, labels=input_ids)

@gante Please note that my question is related to TFEncoderDecoderModel; therefore, model.fit(x, y) is not enough! We need to provide the encoder input, decoder input and decoder output, as HF suggests in its official documentation: https://huggingface.co/docs/transformers/v4.20.1/en/model_doc/encoder-decoder#transformers.TFEncoderDecoderModel.call.example
Thus, this bug's title is "TFEncoderDecoderModel can not be trained with TF Keras fit() method". If you know how to train TFEncoderDecoderModel with TF or Keras please share with me.
Because in the current model.fit() I am not able to do it. Thank you for your attention.
Hi @kmkarakaya -- the example you linked runs fine and, as I've written above, the issue with your example is in the arguments to model.fit.
Please see our examples to learn how to prepare the data for training. For instance, see here -- you need to prepare your data into a dataset in advance.
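As a rough sketch of what "preparing the data in advance" could look like for the three inputs discussed in this thread, the helper below packs the encoder input, decoder input and labels into a single dict keyed by the argument names used in the linked documentation example. The helper name `build_features` and the token ids are hypothetical, and the code is plain Python so it runs without TensorFlow installed.

```python
from typing import Dict, List

Batch = List[List[int]]


def build_features(encoder_ids: Batch, decoder_ids: Batch, label_ids: Batch) -> Dict[str, Batch]:
    """Pack the three sequences into one dict keyed by the model's argument names."""
    if not (len(encoder_ids) == len(decoder_ids) == len(label_ids)):
        raise ValueError("all inputs need the same number of examples")
    return {
        "input_ids": encoder_ids,
        "decoder_input_ids": decoder_ids,
        "labels": label_ids,
    }


# Made-up token ids standing in for the tokenizer output.
ids = [[101, 8667, 102]]
features = build_features(ids, ids, ids)
print(sorted(features))
```

With TensorFlow available, a dict like this could be wrapped with `tf.data.Dataset.from_tensor_slices(features).batch(1)` and handed to `model.fit` as its single data argument. That the model reads `labels` from the dict is an assumption here and may vary between transformers versions.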
Finally, as per our issues guidelines, we reserve GitHub issues for bugs in the repository and/or feature requests. For any other matters, we'd like to invite you to use our forum 🤗
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.