TFEncoderDecoderModel can not be trained with TF Keras fit() method

Open kmkarakaya opened this issue 3 years ago • 3 comments

System Info

  • transformers version: 4.21.0
  • Platform: Linux-4.15.0-188-generic-x86_64-with-glibc2.31
  • Python version: 3.9.12
  • Huggingface_hub version: 0.8.1
  • PyTorch version (GPU?): not installed (NA)
  • Tensorflow version (GPU?): 2.6.2 (True)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Using GPU in script?:
  • Using distributed or parallel set-up in script?:

Who can help?

No response

Information

  • [x] The official example scripts
  • [ ] My own modified scripts

Tasks

  • [x] An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • [ ] My own task or dataset (give details below)

Reproduction

Use the example here: https://huggingface.co/docs/transformers/v4.20.1/en/model_doc/encoder-decoder#transformers.TFEncoderDecoderModel.call.example

  1. Try to fit the model: model.fit(input_ids=input_ids, decoder_input_ids=input_ids)
  2. You will receive the error "TypeError: fit() got an unexpected keyword argument 'input_ids'"
  3. You can instead try: model.fit(input_ids, input_ids)
  4. But then you receive many errors: [image]

Expected behavior

I should be able to train a TFEncoderDecoderModel with TF Keras fit() method

kmkarakaya avatar Aug 04 '22 12:08 kmkarakaya

Hi @kmkarakaya 👋 Having a popular project like transformers means we get many support and feature requests — if we want to maximize how much we help the community, the community has to help us stay productive 🙏

To that end, please share a short script where the issue is clearly reproducible on any computer. Thank you 🤗

gante avatar Aug 05 '22 12:08 gante

Hi @gante,
Here is the script ( https://huggingface.co/docs/transformers/v4.20.1/en/model_doc/encoder-decoder#transformers.TFEncoderDecoderModel.call.example ), which I modified to train the model as below:

import tensorflow as tf
from transformers import TFEncoderDecoderModel, BertTokenizer

model = TFEncoderDecoderModel.from_encoder_decoder_pretrained("bert-base-cased", "gpt2")
tokenizer = BertTokenizer.from_pretrained("bert-base-cased")

model.compile(loss=None)
model.fit(input_ids=input_ids, decoder_input_ids=input_ids, labels=input_ids)

The error message: ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

kmkarakaya avatar Aug 09 '22 10:08 kmkarakaya

Hi @kmkarakaya -- technically I can't reproduce the script, since I don't have access to your input_ids.

However, looking at the code, I can tell that model.fit is not being called correctly. Please check its documentation, especially its x and y arguments :)

gante avatar Aug 09 '22 10:08 gante

@gante As I wrote in every message, this code comes from the HF docs: https://huggingface.co/docs/transformers/v4.20.1/en/model_doc/encoder-decoder#transformers.TFEncoderDecoderModel.call.example

Here is the complete code from the HF link. I hope this time you can help fix the problem:

from transformers import TFEncoderDecoderModel, BertTokenizer

model = TFEncoderDecoderModel.from_encoder_decoder_pretrained("bert-base-cased", "gpt2")
tokenizer = BertTokenizer.from_pretrained("bert-base-cased")

input_ids = tokenizer.encode(
    "Hello, my dog is cute", add_special_tokens=True, return_tensors="tf"
)

model.compile(loss=None)
model.fit(input_ids=input_ids, decoder_input_ids=input_ids, labels=input_ids)

kmkarakaya avatar Aug 10 '22 12:08 kmkarakaya

@gante Please note that my question is about TFEncoderDecoderModel; therefore, model.fit(x, y) is not enough! We need to provide the encoder input, the decoder input, and the decoder output, as HF suggests in its official documentation: https://huggingface.co/docs/transformers/v4.20.1/en/model_doc/encoder-decoder#transformers.TFEncoderDecoderModel.call.example

Thus, this bug's title is "TFEncoderDecoderModel can not be trained with TF Keras fit() method". If you know how to train TFEncoderDecoderModel with TF or Keras please share with me.

With the current model.fit() I am not able to do it. Thank you for your attention.

kmkarakaya avatar Aug 10 '22 12:08 kmkarakaya

Hi @kmkarakaya -- the example you linked runs fine and, as I've written above, the issue with your example is in the arguments to model.fit.

Please see our examples to learn how to prepare the data for training. For instance, see here -- you need to prepare your data into a dataset in advance.
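As a rough illustration of the point about the arguments to model.fit: Keras fit() has no input_ids keyword, so named inputs are usually bundled into a single mapping and passed as x, often wrapped in a tf.data.Dataset. The sketch below assumes the key names used earlier in this thread (input_ids, decoder_input_ids, labels). The helper names pack_fit_inputs and fit_on_features are hypothetical, and the thread does not confirm that TFEncoderDecoderModel trains successfully this way on transformers 4.21.0.

```python
def pack_fit_inputs(input_ids, decoder_input_ids, labels):
    """Bundle the named inputs into one mapping to pass as Keras `x`.

    Hypothetical helper: each argument is a list of token-id rows,
    one row per example, padded to a common length.
    """
    features = {
        "input_ids": input_ids,
        "decoder_input_ids": decoder_input_ids,
        "labels": labels,
    }
    # All entries must describe the same number of examples.
    sizes = {name: len(rows) for name, rows in features.items()}
    if len(set(sizes.values())) != 1:
        raise ValueError(f"mismatched example counts: {sizes}")
    return features


def fit_on_features(model, features, batch_size=1, epochs=1):
    """Assumed usage: wrap the mapping in a tf.data.Dataset and call fit()."""
    import tensorflow as tf  # imported lazily so the helper above needs no TF

    dataset = tf.data.Dataset.from_tensor_slices(features).batch(batch_size)
    # The dataset is passed positionally as `x`; fit() has no `input_ids` keyword.
    return model.fit(dataset, epochs=epochs)
```

Here fit_on_features expects an already compiled model. Which loss the model ends up using depends on how compile() was called, which this sketch leaves open.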

Finally, as per our issues guidelines, we reserve GitHub issues for bugs in the repository and/or feature requests. For any other matters, we'd like to invite you to use our forum 🤗

gante avatar Aug 10 '22 15:08 gante

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

github-actions[bot] avatar Sep 03 '22 15:09 github-actions[bot]