transformers icon indicating copy to clipboard operation
transformers copied to clipboard

Upgrading sentencepiece modeling file (for proto > 4 support).

Open Narsil opened this issue 1 year ago • 2 comments

What does this PR do?

Upgrades the file.

Taken from google/sentencepiece directly.

Should prevent "Downgrade the protobuf package".

Fixes # (issue)

Before submitting

  • [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • [ ] Did you read the contributor guideline, Pull Request section?
  • [ ] Was this discussed/approved via a Github issue or the forum? Please add a link to it if that's the case.
  • [ ] Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
  • [ ] Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.

Narsil avatar Apr 26 '23 16:04 Narsil

The documentation is not available anymore as the PR was closed or merged.

Can you push an empty commit with "[all-test]" in the message? I'd like to see if there is a problem with an older version of protobuf (CI should be on < 4).

sgugger avatar Apr 26 '23 16:04 sgugger

I remember we said we should check if the protobuf version is installed somewhere, as we had some issues. Did it turn out to be ok?

ArthurZucker avatar Apr 28 '23 09:04 ArthurZucker

It turns out the file in https://github.com/google/sentencepiece is actually not even valid protobuf 4.x ...

Narsil avatar Apr 28 '23 09:04 Narsil

So this doesn't work with protobuf 4.x in the end?

sgugger avatar Apr 28 '23 12:04 sgugger

Nope. it passes with protobuf 3.20 in the tests, but not with 4.x.....

I think we should wait for them to upgrade before doing it ourselves (the .proto files are available so we could generate them with a 4.x compiler... but I don't like doing that.) As much as handling 4.x and 3.20 codebases is annoying I don't want to spend much time on this tbh.

I'll close this PR, we can resurrrect later maybe.

Narsil avatar Apr 28 '23 16:04 Narsil

Thanks for having tried!

sgugger avatar Apr 28 '23 16:04 sgugger