transformers

fix bug auto loading llamatokenizer

Open vxfla opened this issue 1 year ago • 2 comments

What does this PR do?

The tokenizer config of the Hugging Face checkpoint decapoda-research/llama-7b-hf names the tokenizer class LLaMATokenizer, while transformers names it LlamaTokenizer. This PR unifies the name as LLaMATokenizer so that AutoTokenizer can load the LLaMA tokenizer from that checkpoint.

Fixes # (issue)

Before submitting

  • [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • [ ] Did you read the contributor guideline, Pull Request section?
  • [ ] Was this discussed/approved via a Github issue or the forum? Please add a link to it if that's the case.
  • [ ] Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
  • [ ] Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.

vxfla Apr 25 '23 13:04

Lol no, but nice try. Maybe decapoda-research/llama-7b-hf should merge one of the multiple PRs they received that fixes the tokenizer on their side.

sgugger Apr 25 '23 13:04
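
For readers who hit the same mismatch, here is a minimal sketch of the checkpoint-side fix suggested above, applied to a local copy of the checkpoint. It assumes the class name is stored under the `tokenizer_class` key of `tokenizer_config.json`; the helper name `patch_tokenizer_class` and its parameters are hypothetical and not part of transformers.

```python
import json
from pathlib import Path


def patch_tokenizer_class(checkpoint_dir, old="LLaMATokenizer", new="LlamaTokenizer"):
    """Rewrite `tokenizer_class` in a local tokenizer_config.json.

    Hypothetical helper, not a transformers API.
    Returns True if the file was changed, False if there was nothing to fix.
    """
    config_path = Path(checkpoint_dir) / "tokenizer_config.json"
    config = json.loads(config_path.read_text(encoding="utf-8"))

    # Only touch the file when it carries the old spelling.
    if config.get("tokenizer_class") != old:
        return False

    config["tokenizer_class"] = new
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return True
```

After patching a downloaded copy, pointing `AutoTokenizer.from_pretrained` at that directory should resolve the class by its transformers name. Calling `LlamaTokenizer.from_pretrained` directly is another option that skips the name lookup.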

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

github-actions[bot] May 25 '23 15:05