
Protobuf 4 support

Open RobinKa opened this issue 2 years ago • 4 comments

Feature request

Currently transformers requires protobuf 3 or lower:

https://github.com/huggingface/transformers/blob/a8eb4f79f946c5785f0e91b356ce328248916a05/setup.py#L141

Support for version 4 should be added.

Motivation

Some Python packages only work with protobuf 4 so transformers is incompatible with them (for example flytekit >= 1.3).

Your contribution

RobinKa avatar Feb 17 '23 12:02 RobinKa

Last time we checked, protobuf>=4 was blowing up sentencepiece entirely, which is a dependency we really need in Transformers. I don't know if that has been fixed since then, maybe @ydshieh could check when he has some time?

sgugger avatar Feb 17 '23 16:02 sgugger

Running the T5 tokenization tests produces a lot of failures (the T5 tokenizer uses sentencepiece) if I use protobuf==4.22.0.

Also, I see the following conflicts when I install the latest protobuf:

tensorflow 2.11.0 requires protobuf<3.20,>=3.9.2, but you have protobuf 4.22.0 which is incompatible.
tensorboardx 2.5.1 requires protobuf<=3.20.1,>=3.8.0, but you have protobuf 4.22.0 which is incompatible.
tensorboard 2.11.1 requires protobuf<4,>=3.9.2, but you have protobuf 4.22.0 which is incompatible.
ray 2.0.0 requires protobuf<4.0.0,>=3.15.3, but you have protobuf 4.22.0 which is incompatible.
onnx 1.12.0 requires protobuf<=3.20.1,>=3.12.2, but you have protobuf 4.22.0 which is incompatible.
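
For anyone who wants to reproduce a list like the one above in their own environment, here is a minimal sketch using only the standard library. It is not part of transformers; the helper name `protobuf_constraints` is hypothetical, and it only reports the requirement strings each installed distribution declares, without evaluating whether a given protobuf version satisfies them.

```python
import re
from importlib import metadata


def protobuf_constraints(package="protobuf"):
    """Map each installed distribution to the requirement strings it declares on `package`.

    Hypothetical helper for illustration; it reads installed package metadata only.
    """
    # Match the requirement name at the start of the string, and make sure it is
    # not just a prefix of a longer name such as "protobuf-something".
    pattern = re.compile(rf"^{re.escape(package)}(?![A-Za-z0-9._-])", re.IGNORECASE)
    found = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name") or "<unknown>"
        # dist.requires is None when a distribution declares no dependencies.
        for req in dist.requires or []:
            if pattern.match(req.strip()):
                found.setdefault(name, []).append(req)
    return found


if __name__ == "__main__":
    for name, reqs in sorted(protobuf_constraints().items()):
        print(f"{name}: {'; '.join(reqs)}")
```

Any package that shows an upper bound below 4 in this output would conflict with protobuf 4 in the same way as the entries listed above.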

If tensorflow is installed in this case, even the PyTorch tests will fail, because trainer_utils.py imports it:

  File "/home/huggingface/transformers-hf-gcp/src/transformers/trainer_utils.py", line 47, in <module>
    import tensorflow as tf

ydshieh avatar Feb 17 '23 18:02 ydshieh

So it looks like lots of libraries in our soft dependencies do not support protobuf 4 yet. We won't be able to offer support either until they do :-)

sgugger avatar Feb 20 '23 08:02 sgugger