Fine-tuning t5-base model raises an error

Open krzysztoffiok opened this issue 4 years ago • 9 comments

Hi,

I tried to fine-tune the T5-base model on Google Colab and got this error:

ValueError("You have to specify either decoder_input_ids or decoder_inputs_embeds")

More specifically, the error occurs at the very moment training should start:

2020-06-03 12:07:25,877 ----------------------------------------------------------------------------------------------------
2020-06-03 12:07:25,877 Corpus: "Corpus: 4800 train + 1200 dev + 20630 test sentences"
2020-06-03 12:07:25,878 ----------------------------------------------------------------------------------------------------
2020-06-03 12:07:25,878 Parameters:
2020-06-03 12:07:25,878 - learning_rate: "3e-06"
2020-06-03 12:07:25,879 - mini_batch_size: "8"
2020-06-03 12:07:25,879 - patience: "3"
2020-06-03 12:07:25,879 - anneal_factor: "0.5"
2020-06-03 12:07:25,880 - max_epochs: "4"
2020-06-03 12:07:25,880 - shuffle: "True"
2020-06-03 12:07:25,880 - train_with_dev: "False"
2020-06-03 12:07:25,880 - batch_growth_annealing: "False"
2020-06-03 12:07:25,880 ----------------------------------------------------------------------------------------------------
2020-06-03 12:07:25,880 Model training base path: "semeval_data/model_sentiment_0"
2020-06-03 12:07:25,880 ----------------------------------------------------------------------------------------------------
2020-06-03 12:07:25,880 Device: cuda:0
2020-06-03 12:07:25,881 ----------------------------------------------------------------------------------------------------
2020-06-03 12:07:25,881 Embeddings storage mode: cpu
2020-06-03 12:07:25,883 ----------------------------------------------------------------------------------------------------
Traceback (most recent call last):
  File "./model_train.py", line 138, in <module>
    shuffle=True,
  File "/usr/local/lib/python3.6/dist-packages/flair/trainers/trainer.py", line 349, in train
    loss = self.model.forward_loss(batch_step)
  File "/usr/local/lib/python3.6/dist-packages/flair/models/text_classification_model.py", line 142, in forward_loss
    scores = self.forward(data_points)
  File "/usr/local/lib/python3.6/dist-packages/flair/models/text_classification_model.py", line 98, in forward
    self.document_embeddings.embed(sentences)
  File "/usr/local/lib/python3.6/dist-packages/flair/embeddings/base.py", line 59, in embed
    self._add_embeddings_internal(sentences)
  File "/usr/local/lib/python3.6/dist-packages/flair/embeddings/document.py", line 91, in _add_embeddings_internal
    self._add_embeddings_to_sentences(batch)
  File "/usr/local/lib/python3.6/dist-packages/flair/embeddings/document.py", line 136, in _add_embeddings_to_sentences
    else self.model(input_ids)[-1]
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/transformers/modeling_t5.py", line 955, in forward
    use_cache=use_cache,
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/transformers/modeling_t5.py", line 674, in forward
    raise ValueError("You have to specify either decoder_input_ids or decoder_inputs_embeds")
ValueError: You have to specify either decoder_input_ids or decoder_inputs_embeds

To Reproduce

Go to Google Colab, create a new project with a GPU runtime, and run the following:

!git clone https://github.com/krzysztoffiok/twitter_sentiment
!pip3 install flair
!pip3 install datatable
cd twitter_sentiment
!python3 ./semeval_data_splitter.py
!python3 ./model_train.py --dataset=semeval --k_folds=5 --test_run=t5-base --fine_tune

Expected behavior

The script should start training (fine-tuning) a list of models, the first of which is t5-base.

Environment

Google Colab GPU runtime

!nvidia-smi
Wed Jun  3 12:11:08 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.82       Driver Version: 418.67       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla K80           Off  | 00000000:00:04.0 Off |                    0 |
| N/A   36C    P8    26W / 149W |      0MiB / 11441MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

!pip3 freeze returns:

absl-py==0.9.0 alabaster==0.7.12 albumentations==0.1.12 altair==4.1.0 asgiref==3.2.7 astor==0.8.1 astropy==4.0.1.post1 astunparse==1.6.3 atari-py==0.2.6 atomicwrites==1.4.0 attrs==19.3.0 audioread==2.1.8 autograd==1.3 Babel==2.8.0 backcall==0.1.0 beautifulsoup4==4.6.3 bleach==3.1.5 blessed==1.17.6 blis==0.4.1 bokeh==1.4.0 boto==2.49.0 boto3==1.13.19 botocore==1.16.19 Bottleneck==1.3.2 bpemb==0.3.0 branca==0.4.1 bs4==0.0.1 CacheControl==0.12.6 cachetools==3.1.1 catalogue==1.0.0 certifi==2020.4.5.1 cffi==1.14.0 chainer==6.5.0 chardet==3.0.4 click==7.1.2 cloudpickle==1.3.0 cmake==3.12.0 cmdstanpy==0.4.0 colorama==0.4.3 colorlover==0.3.0 community==1.0.0b1 contextlib2==0.5.5 convertdate==2.2.1 coverage==3.7.1 coveralls==0.5 crcmod==1.7 cufflinks==0.17.3 cupy-cuda101==6.5.0 cvxopt==1.2.5 cvxpy==1.0.31 cycler==0.10.0 cymem==2.0.3 Cython==0.29.19 daft==0.0.4 dask==2.12.0 dataclasses==0.7 datascience==0.10.6 datatable==0.10.1 decorator==4.4.2 defusedxml==0.6.0 Deprecated==1.2.10 descartes==1.1.0 dill==0.3.1.1 distributed==1.25.3 Django==3.0.6 dlib==19.18.0 docopt==0.6.2 docutils==0.15.2 dopamine-rl==1.0.5 earthengine-api==0.1.223 easydict==1.9 ecos==2.0.7.post1 editdistance==0.5.3 en-core-web-sm==2.2.5 entrypoints==0.3 ephem==3.7.7.1 et-xmlfile==1.0.1 fa2==0.3.5 fancyimpute==0.4.3 fastai==1.0.61 fastdtw==0.3.4 fastprogress==0.2.3 fastrlock==0.4 fbprophet==0.6 feather-format==0.4.1 featuretools==0.4.1 filelock==3.0.12 firebase-admin==4.1.0 fix-yahoo-finance==0.0.22 flair==0.5 Flask==1.1.2 folium==0.8.3 fsspec==0.7.4 future==0.16.0 gast==0.3.3 GDAL==2.2.2 gdown==3.6.4 gensim==3.6.0 geographiclib==1.50 geopy==1.17.0 gin-config==0.3.0 glob2==0.7 google==2.0.3 google-api-core==1.16.0 google-api-python-client==1.7.12 google-auth==1.7.2 google-auth-httplib2==0.0.3 google-auth-oauthlib==0.4.1 google-cloud-bigquery==1.21.0 google-cloud-core==1.0.3 google-cloud-datastore==1.8.0 google-cloud-firestore==1.7.0 google-cloud-language==1.2.0 google-cloud-storage==1.18.1 
google-cloud-translate==1.5.0 google-colab==1.0.0 google-pasta==0.2.0 google-resumable-media==0.4.1 googleapis-common-protos==1.51.0 googledrivedownloader==0.4 graphviz==0.10.1 grpcio==1.29.0 gspread==3.0.1 gspread-dataframe==3.0.7 gym==0.17.2 h5py==2.10.0 HeapDict==1.0.1 holidays==0.9.12 html5lib==1.0.1 httpimport==0.5.18 httplib2==0.17.4 httplib2shim==0.0.3 humanize==0.5.1 hyperopt==0.1.2 ideep4py==2.0.0.post3 idna==2.9 image==1.5.32 imageio==2.4.1 imagesize==1.2.0 imbalanced-learn==0.4.3 imblearn==0.0 imgaug==0.2.9 importlib-metadata==1.6.0 imutils==0.5.3 inflect==2.1.0 intel-openmp==2020.0.133 intervaltree==2.1.0 ipykernel==4.10.1 ipython==5.5.0 ipython-genutils==0.2.0 ipython-sql==0.3.9 ipywidgets==7.5.1 itsdangerous==1.1.0 jax==0.1.68 jaxlib==0.1.47 jdcal==1.4.1 jedi==0.17.0 jieba==0.42.1 Jinja2==2.11.2 jmespath==0.10.0 joblib==0.15.1 jpeg4py==0.1.4 jsonschema==2.6.0 jupyter==1.0.0 jupyter-client==5.3.4 jupyter-console==5.2.0 jupyter-core==4.6.3 kaggle==1.5.6 kapre==0.1.3.1 Keras==2.3.1 Keras-Applications==1.0.8 Keras-Preprocessing==1.1.2 keras-vis==0.4.1 kiwisolver==1.2.0 knnimpute==0.1.0 langdetect==1.0.8 librosa==0.6.3 lightgbm==2.2.3 llvmlite==0.31.0 lmdb==0.98 lucid==0.3.8 LunarCalendar==0.0.9 lxml==4.2.6 Markdown==3.2.2 MarkupSafe==1.1.1 matplotlib==3.2.1 matplotlib-venn==0.11.5 missingno==0.4.2 mistune==0.8.4 mizani==0.6.0 mkl==2019.0 mlxtend==0.14.0 more-itertools==8.3.0 moviepy==0.2.3.5 mpld3==0.3 mpmath==1.1.0 msgpack==1.0.0 multiprocess==0.70.9 multitasking==0.0.9 murmurhash==1.0.2 music21==5.5.0 natsort==5.5.0 nbconvert==5.6.1 nbformat==5.0.6 networkx==2.4 nibabel==3.0.2 nltk==3.2.5 notebook==5.2.2 np-utils==0.5.12.1 numba==0.48.0 numexpr==2.7.1 numpy==1.18.4 nvidia-ml-py3==7.352.0 oauth2client==4.1.3 oauthlib==3.1.0 okgrade==0.4.3 opencv-contrib-python==4.1.2.30 opencv-python==4.1.2.30 openpyxl==2.5.9 opt-einsum==3.2.1 osqp==0.6.1 packaging==20.4 palettable==3.3.0 pandas==1.0.4 pandas-datareader==0.8.1 pandas-gbq==0.11.0 pandas-profiling==1.4.1 
pandocfilters==1.4.2 parso==0.7.0 pathlib==1.0.1 patsy==0.5.1 pexpect==4.8.0 pickleshare==0.7.5 Pillow==7.0.0 pip-tools==4.5.1 plac==1.1.3 plotly==4.4.1 plotnine==0.6.0 pluggy==0.13.1 portpicker==1.3.1 prefetch-generator==1.0.1 preshed==3.0.2 prettytable==0.7.2 progressbar2==3.38.0 prometheus-client==0.8.0 promise==2.3 prompt-toolkit==1.0.18 protobuf==3.10.0 psutil==5.4.8 psycopg2==2.7.6.1 ptvsd==5.0.0a12 ptyprocess==0.6.0 py==1.8.1 pyarrow==0.14.1 pyasn1==0.4.8 pyasn1-modules==0.2.8 pycocotools==2.0.0 pycparser==2.20 pydata-google-auth==1.1.0 pydot==1.3.0 pydot-ng==2.0.0 pydotplus==2.0.2 PyDrive==1.3.1 pyemd==0.5.1 pyglet==1.5.0 Pygments==2.1.3 pygobject==3.26.1 pymc3==3.7 PyMeeus==0.3.7 pymongo==3.10.1 pymystem3==0.2.0 PyOpenGL==3.1.5 pyparsing==2.4.7 pyrsistent==0.16.0 pysndfile==1.3.8 PySocks==1.7.1 pystan==2.19.1.1 pytest==5.4.3 python-apt==1.6.5+ubuntu0.2 python-chess==0.23.11 python-dateutil==2.8.1 python-louvain==0.14 python-slugify==4.0.0 python-utils==2.4.0 pytz==2018.9 PyWavelets==1.1.1 PyYAML==3.13 pyzmq==19.0.1 qtconsole==4.7.4 QtPy==1.9.0 regex==2019.12.20 requests==2.23.0 requests-oauthlib==1.3.0 resampy==0.2.2 retrying==1.3.3 rpy2==3.2.7 rsa==4.0 s3fs==0.4.2 s3transfer==0.3.3 sacremoses==0.0.43 scikit-image==0.16.2 scikit-learn==0.22.2.post1 scipy==1.4.1 screen-resolution-extra==0.0.0 scs==2.1.2 seaborn==0.10.1 segtok==1.5.10 Send2Trash==1.5.0 sentencepiece==0.1.91 setuptools-git==1.2 Shapely==1.7.0 simplegeneric==0.8.1 six==1.12.0 sklearn==0.0 sklearn-pandas==1.8.0 smart-open==2.0.0 snowballstemmer==2.0.0 sortedcontainers==2.1.0 spacy==2.2.4 Sphinx==1.8.5 sphinxcontrib-websupport==1.2.2 SQLAlchemy==1.3.17 sqlitedict==1.6.0 sqlparse==0.3.1 srsly==1.0.2 statsmodels==0.10.2 sympy==1.1.1 tables==3.4.4 tabulate==0.8.7 tbb==2020.0.133 tblib==1.6.0 tensorboard==2.2.2 tensorboard-plugin-wit==1.6.0.post3 tensorboardcolab==0.0.22 tensorflow==2.2.0 tensorflow-addons==0.8.3 tensorflow-datasets==2.1.0 tensorflow-estimator==2.2.0 tensorflow-gcs-config==2.1.8 
tensorflow-hub==0.8.0 tensorflow-metadata==0.22.1 tensorflow-privacy==0.2.2 tensorflow-probability==0.10.0 termcolor==1.1.0 terminado==0.8.3 testpath==0.4.4 text-unidecode==1.3 textblob==0.15.3 textgenrnn==1.4.1 Theano==1.0.4 thinc==7.4.0 tifffile==2020.5.30 tokenizers==0.7.0 toolz==0.10.0 torch==1.5.0+cu101 torchsummary==1.5.1 torchtext==0.3.1 torchvision==0.6.0+cu101 tornado==4.5.3 tqdm==4.41.1 traitlets==4.3.3 transformers==2.11.0 tweepy==3.6.0 typeguard==2.7.1 typesentry==0.2.7 typing==3.6.6 typing-extensions==3.6.6 tzlocal==1.5.1 umap-learn==0.4.3 uritemplate==3.0.1 urllib3==1.24.3 vega-datasets==0.8.0 wasabi==0.6.0 wcwidth==0.1.9 webencodings==0.5.1 Werkzeug==1.0.1 widgetsnbextension==3.5.1 wordcloud==1.5.0 wrapt==1.12.1 xarray==0.15.1 xgboost==0.90 xkit==0.0.0 xlrd==1.1.0 xlwt==1.3.0 yellowbrick==0.9.1 zict==2.0.0 zipp==3.1.0
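
Only a few of the pins above are likely to matter for this error (Flair itself plus the libraries named in the traceback). As a rough sketch, they can be pulled out of `pip freeze` style text like this; the `RELEVANT` tuple and the `parse_freeze` helper are illustrative names, not part of Flair or pip:

```python
# Packages of interest for this report (an illustrative choice).
RELEVANT = ("flair", "torch", "transformers")


def parse_freeze(text):
    """Map lower-cased package name -> pinned version for `name==version` tokens."""
    pins = {}
    for token in text.split():
        name, sep, version = token.partition("==")
        if sep:  # ignore anything that is not a `name==version` pin
            pins[name.lower()] = version
    return pins


# A handful of pins copied from the list above.
sample = "filelock==3.0.12 flair==0.5 Flask==1.1.2 torch==1.5.0+cu101 transformers==2.11.0"
pins = parse_freeze(sample)
relevant = {name: pins.get(name) for name in RELEVANT}
print(relevant)
```

In a notebook, the same function could be fed the full output of `pip3 freeze` instead of the short sample string.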

krzysztoffiok, Jun 03 '20 12:06

Did you solve the error? I am facing the same bug.

nightlessbaron, Aug 01 '20 14:08

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot], Nov 29 '20 15:11

I have a working solution for it, will prepare a PR for that soon, so re-opening it!

stefan-it, Apr 06 '22 14:04

> I have a working solution for it, will prepare a PR for that soon, so re-opening it!

Hi @stefan-it, are there any updates on the PR? The error still persists.

ataniz, Jul 18 '22 10:07

I am facing the same issue with mt5-small. Can anyone fix this? Any guidance is welcome. Thanks in advance.

Madhu000, Aug 07 '22 19:08

Hi @ataniz and @Madhu000 ,

sorry for the late reply! I pushed a working version of encoder-only fine-tuning for T5 models:

https://github.com/flairNLP/flair/pull/2896

Feel free to test it :hugs:

stefan-it, Aug 08 '22 10:08

When I test with this branch, the same error occurs. Please help me out. Thanks in advance. Please find the log below:

2022-08-08 20:25:36,116
2022-08-08 20:25:36,117 Corpus: "MultiCorpus: 644 train + 92 dev + 186 test sentences - ColumnCorpus Corpus: 644 train + 92 dev + 186 test sentences - /root/.flair/datasets/ner_masakhane/luo"
2022-08-08 20:25:36,117
2022-08-08 20:25:36,117 Parameters:
2022-08-08 20:25:36,117 - learning_rate: "0.000050"
2022-08-08 20:25:36,117 - mini_batch_size: "4"
2022-08-08 20:25:36,117 - patience: "3"
2022-08-08 20:25:36,117 - anneal_factor: "0.5"
2022-08-08 20:25:36,117 - max_epochs: "10"
2022-08-08 20:25:36,117 - shuffle: "True"
2022-08-08 20:25:36,117 - train_with_dev: "False"
2022-08-08 20:25:36,118 - batch_growth_annealing: "False"
2022-08-08 20:25:36,118
2022-08-08 20:25:36,118 Model training base path: "conll-03-t5-base"
2022-08-08 20:25:36,118
2022-08-08 20:25:36,118 Device: cuda:0
2022-08-08 20:25:36,118
2022-08-08 20:25:36,118 Embeddings storage mode: none
2022-08-08 20:25:36,118

Traceback (most recent call last):
  File "run_ner.py", line 158, in <module>
    main()
  File "run_ner.py", line 147, in main
    weight_decay=training_args.weight_decay,
  File "/usr/local/lib/python3.7/dist-packages/flair/trainers/trainer.py", line 909, in fine_tune
    **trainer_args,
  File "/usr/local/lib/python3.7/dist-packages/flair/trainers/trainer.py", line 500, in train
    loss = self.model.forward_loss(batch_step)
  File "/usr/local/lib/python3.7/dist-packages/flair/models/sequence_tagger_model.py", line 270, in forward_loss
    scores, gold_labels = self.forward(sentences)  # type: ignore
  File "/usr/local/lib/python3.7/dist-packages/flair/models/sequence_tagger_model.py", line 282, in forward
    self.embeddings.embed(sentences)
  File "/usr/local/lib/python3.7/dist-packages/flair/embeddings/base.py", line 62, in embed
    self._add_embeddings_internal(data_points)
  File "/usr/local/lib/python3.7/dist-packages/flair/embeddings/base.py", line 766, in _add_embeddings_internal
    self._add_embeddings_to_sentences(expanded_sentences)
  File "/usr/local/lib/python3.7/dist-packages/flair/embeddings/base.py", line 692, in _add_embeddings_to_sentences
    hidden_states = self.model(input_ids, **model_kwargs)[-1]
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/transformers/models/t5/modeling_t5.py", line 1438, in forward
    return_dict=return_dict,
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/transformers/models/t5/modeling_t5.py", line 932, in forward
    raise ValueError(f"You have to specify either {err_msg_prefix}input_ids or {err_msg_prefix}inputs_embeds")
ValueError: You have to specify either decoder_input_ids or decoder_inputs_embeds
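
The frames of a traceback like the one above can also be extracted programmatically, which makes it easier to compare file paths and line numbers against a particular Flair release. This is only a sketch using the standard `re` module; `FRAME_RE`, `extract_frames`, and `from_site_install` are hypothetical helper names, and the sample is an abbreviated excerpt of two frames from the log:

```python
import re

# Matches the `File "<path>", line <n>, in <function>` header of each frame.
FRAME_RE = re.compile(r'File "(?P<path>[^"]+)", line (?P<line>\d+), in (?P<func>\S+)')


def extract_frames(traceback_text):
    """Return (path, line number, function name) for every frame header found."""
    return [(m["path"], int(m["line"]), m["func"]) for m in FRAME_RE.finditer(traceback_text)]


def from_site_install(path):
    """True if the file lives in a regular install directory rather than a source checkout."""
    return "dist-packages" in path or "site-packages" in path


# Abbreviated excerpt: two frames copied from the traceback, source lines shortened.
sample = (
    'File "/usr/local/lib/python3.7/dist-packages/flair/embeddings/base.py", line 692, '
    'in _add_embeddings_to_sentences hidden_states = self.model(input_ids, **model_kwargs)[-1] '
    'File "/usr/local/lib/python3.7/dist-packages/transformers/models/t5/modeling_t5.py", line 932, '
    'in forward raise ValueError(...)'
)

frames = extract_frames(sample)
for path, line, func in frames:
    print(f"{path}:{line} in {func} (site install: {from_site_install(path)})")
```

A `dist-packages` or `site-packages` path for the Flair frames suggests the code being run comes from a regular pip install rather than from a checked-out branch.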

Madhu000, Aug 08 '22 20:08

Hi @Madhu000 ,

It seems that the Flair in your virtual environment is still the installed 0.11 release. This is visible in the logs: flair/embeddings/base.py has no line 692 on the latest master because of a recent refactoring. Here's a short snippet showing how to use the T5 encoder fix branch:

pip3 uninstall flair

git clone https://github.com/flairNLP/flair.git
cd flair
git checkout add-t5-encoder-support
pip3 install -e .

Then you can try using it again :)
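
To double-check which Flair the interpreter actually picks up after reinstalling, something along these lines may help. It is a minimal sketch using only the standard library (`importlib.metadata` needs Python 3.8 or newer); `describe_install` is a hypothetical helper name, and it assumes the distribution name and the import name are the same, as they are for `flair`:

```python
import importlib.util
from importlib import metadata


def describe_install(package):
    """Return (version, location) for an installed package, or None if it is not installed."""
    try:
        version = metadata.version(package)
    except metadata.PackageNotFoundError:
        return None
    # find_spec locates a top-level package without importing it.
    spec = importlib.util.find_spec(package)
    location = spec.origin if spec is not None else None
    return version, location


# Prints None when flair is not installed in the current environment.
print(describe_install("flair"))
```

With an editable install (`pip3 install -e .`), the reported location would be expected to point into the cloned repository instead of `dist-packages`.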

stefan-it, Aug 08 '22 20:08

Thanks, I'll check it out.

Madhu000, Aug 08 '22 20:08

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot], Dec 24 '22 06:12