Why is there no data sent to the data_collator?

Open Luoyang144 opened this issue 2 years ago • 2 comments

System Info

  • transformers version: 4.27.0.dev0
  • Platform: Linux-3.10.0-514.el7.x86_64-x86_64-with-centos-7.3.1611-Core
  • Python version: 3.7.13
  • Huggingface_hub version: 0.12.0
  • PyTorch version (GPU?): 1.13.1+cu117 (True)
  • Tensorflow version (GPU?): not installed (NA)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Using GPU in script?:
  • Using distributed or parallel set-up in script?:

Who can help?

@sgugger

Information

  • [ ] The official example scripts
  • [X] My own modified scripts

Tasks

  • [ ] An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • [X] My own task or dataset (give details below)

Reproduction

I'm trying to use the code below.

I get the error KeyError: 'seq2seq'. When I print the features passed to the data_collator, the output looks like this: [{}, {}, {}, {}]. Why does this happen? When I print the train dataset directly I get the correct result, but when running trainer.train() the collator can't get the data I need. train.txt
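
The snippet referenced above and train.txt are not reproduced in this thread; as a hypothetical sketch of the general pattern (not the actual code from the issue), a preprocessing function that produces a custom `seq2seq` column can be checked like this before training, to confirm the key the collator expects really exists in the tokenized dataset:

```python
from datasets import Dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# Toy stand-in for the real data; the actual examples come from train.txt.
raw_dataset = Dataset.from_dict(
    {"text": ["hello world", "foo bar"], "target": ["bonjour", "baz"]}
)

def preprocess(example):
    # Every key returned here becomes a column of the tokenized dataset.
    enc = tokenizer(example["text"], truncation=True, max_length=32)
    enc["seq2seq"] = tokenizer(example["target"], truncation=True, max_length=32)["input_ids"]
    return enc

tokenized = raw_dataset.map(preprocess, remove_columns=raw_dataset.column_names)

# If 'seq2seq' is missing here, the collator can only ever receive empty dicts.
print(tokenized.column_names)
print(tokenized[0].keys())
```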

Expected behavior

How can I get the data passed to the collator during training? Thanks for your help.

Luoyang144 avatar Mar 22 '23 03:03 Luoyang144

That's related to the data you are preprocessing, not the Transformers library or its examples. There is simply no "seq2seq" in the features you prepare with your function. I suggest posting on the forums to get help from the larger community.

sgugger avatar Mar 22 '23 12:03 sgugger
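
For readers who hit the same symptom of empty dicts reaching a custom collator: by default, Trainer drops any dataset column whose name is not a parameter of the model's forward method, so custom keys like `seq2seq` can disappear before the collator is called unless `remove_unused_columns=False` is set. The sketch below is a general hint under that assumption, not a confirmed fix for the exact script in this issue:

```python
from transformers import TrainingArguments

# Keep custom dataset columns (e.g. 'seq2seq') from being dropped before they
# reach a custom collator: by default Trainer removes any column whose name is
# not an argument of the model's forward method.
training_args = TrainingArguments(output_dir="out", remove_unused_columns=False)

def data_collator(features):
    # Fail with a readable message instead of a bare KeyError: 'seq2seq'.
    if any("seq2seq" not in f for f in features):
        raise ValueError(f"missing 'seq2seq' in collator input: {features}")
    # Padding and tensor conversion of the real batch are omitted here;
    # this sketch only illustrates the key check.
    return {"seq2seq": [f["seq2seq"] for f in features]}
```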

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

github-actions[bot] avatar Apr 21 '23 15:04 github-actions[bot]