NeMo
Fix mixtral_nemo_to_hf conversion
What does this PR do?
Fix convert_mixtral_nemo_to_hf.py
Collection: scripts/checkpoint_converters
Changelog
- Fix two typos
- Allow for custom tokenizer to be passed in
PR Type:
- [ ] New Feature
- [x] Bugfix
- [ ] Documentation
Who can review?
@cuichenx
Anyone in the community is free to review the PR once the checks have passed. The Contributor guidelines list specific people who can review PRs in various areas.
Is it fine & really required to introduce extra param --input_tokenizer?
If yes, should we update relevant docs? https://github.com/NVIDIA/NeMo/blob/main/docs/source/ckpt_converters/dev_guide.rst
CC @yaoyu-33 - please have a look, maybe you will have some comments
Is it fine & really required to introduce extra param --input_tokenizer?
I'm modeling this off #8000. In both cases, the problem is that the converter tries to use the default HuggingFace tokenizer, while the custom model that's being converted may well be using a different tokenizer.
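To make the intent concrete, here is a minimal sketch of how an optional tokenizer override could be wired into a converter's CLI. Only `--input_tokenizer` comes from this PR; the other flag names, the `resolve_tokenizer_source` helper, and the `default_source` value are hypothetical and are not taken from the actual script.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a toy CLI resembling a NeMo-to-HF converter (illustrative only)."""
    parser = argparse.ArgumentParser(description="Sketch of a NeMo-to-HF converter CLI")
    # Hypothetical flags: the real script may name or structure these differently.
    parser.add_argument("--input_name_or_path", required=True, help="Path to the .nemo checkpoint")
    parser.add_argument("--output_path", required=True, help="Directory for the converted model")
    # The flag discussed in this PR: optional, so existing invocations keep working.
    parser.add_argument(
        "--input_tokenizer",
        default=None,
        help="Tokenizer name or path to use instead of the default one",
    )
    return parser


def resolve_tokenizer_source(args: argparse.Namespace, default_source: str) -> str:
    """Prefer the user-supplied tokenizer; otherwise fall back to the default."""
    if args.input_tokenizer is not None:
        return args.input_tokenizer
    return default_source
```

With this shape, omitting the flag preserves the current behaviour, and passing it only changes where the tokenizer is loaded from.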
Perhaps the correct solution in both cases is to unpack the tokenizer out of the .nemo file during the conversion script, instead of adding a --input_tokenizer argument.
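A rough sketch of that alternative, assuming the .nemo file is a tar archive with the tokenizer artifacts stored as regular members. The `extract_tokenizer_files` helper and the filename hints it matches on are hypothetical; a real implementation would more likely read the tokenizer paths from the model config inside the archive.

```python
import tarfile
from pathlib import Path

# Assumption: tokenizer artifacts can be recognised by these substrings in the file name.
TOKENIZER_HINTS = ("tokenizer", "vocab", "merges")


def extract_tokenizer_files(nemo_path, out_dir):
    """Copy tokenizer-looking members of a .nemo tar archive into out_dir."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    extracted = []
    # "r:*" lets tarfile detect whether the archive is compressed.
    with tarfile.open(nemo_path, "r:*") as archive:
        for member in archive.getmembers():
            # Use only the base name so archive paths cannot escape out_dir.
            name = Path(member.name).name
            if not member.isfile():
                continue
            if not any(hint in name.lower() for hint in TOKENIZER_HINTS):
                continue
            source = archive.extractfile(member)
            if source is None:
                continue
            target = out / name
            target.write_bytes(source.read())
            extracted.append(target)
    return extracted
```

The extracted directory could then be handed to whatever tokenizer loader the converter already uses, which would remove the need for a separate command-line argument.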
This PR is stale because it has been open for 14 days with no activity. Remove stale label or comment or update or this will be closed in 7 days.
This PR was closed because it has been inactive for 7 days since being marked as stale.