Fix mixtral_nemo_to_hf conversion

Open tdene opened this issue 10 months ago • 3 comments

What does this PR do ?

Fix convert_mixtral_nemo_to_hf.py

Collection: scripts/checkpoint_converters

Changelog

  • Fix two typos
  • Allow for custom tokenizer to be passed in
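A minimal sketch of how an optional tokenizer override could be wired into a converter script's command line. Only the `--input_tokenizer` flag name comes from this PR. The other argument names, the `resolve_tokenizer_source` helper, and the placeholder default are hypothetical and are not taken from `convert_mixtral_nemo_to_hf.py`.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a CLI parser with an optional tokenizer override (illustrative only)."""
    parser = argparse.ArgumentParser(description="Sketch of a NeMo-to-HF converter CLI")
    # Hypothetical argument names; the real script may use different ones.
    parser.add_argument("--input_name_or_path", required=True, help="Path to the .nemo checkpoint")
    parser.add_argument("--output_path", required=True, help="Where to write the converted model")
    # Flag named in this PR: lets the caller supply a custom tokenizer.
    parser.add_argument(
        "--input_tokenizer",
        default=None,
        help="Optional path or name of a tokenizer to use instead of the default",
    )
    return parser


def resolve_tokenizer_source(args: argparse.Namespace, default_source: str) -> str:
    """Return the user-supplied tokenizer if given, otherwise the default one."""
    if args.input_tokenizer is not None:
        return args.input_tokenizer
    return default_source


if __name__ == "__main__":
    cli_args = build_parser().parse_args()
    # "default-hf-tokenizer" is a placeholder, not a real model identifier.
    print(resolve_tokenizer_source(cli_args, "default-hf-tokenizer"))
```

Keeping the flag optional with a `None` default means existing invocations of the script would behave as before, and only users with a custom tokenizer need to pass anything extra.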

PR Type:

  • [ ] New Feature
  • [x] Bugfix
  • [ ] Documentation

Who can review?

@cuichenx

Anyone in the community is free to review the PR once the checks have passed. The contributor guidelines list specific people who can review PRs in various areas.

tdene avatar Apr 23 '24 17:04 tdene

Is it fine & really required to introduce extra param --input_tokenizer?

If yes, should we update the relevant docs? https://github.com/NVIDIA/NeMo/blob/main/docs/source/ckpt_converters/dev_guide.rst

CC @yaoyu-33 - please have a look, maybe you will have some comments

janekl avatar Apr 24 '24 09:04 janekl

> Is it fine & really required to introduce extra param --input_tokenizer?

I'm modeling this off #8000. In both cases, the problem is that the converter tries to use the default HuggingFace tokenizer, while the custom model that's being converted may well be using a different tokenizer.

Perhaps the correct solution in both cases is for the conversion script to unpack the tokenizer from the .nemo file, instead of adding a --input_tokenizer argument.
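As a rough illustration of that alternative, the sketch below pulls tokenizer files out of a checkpoint archive. It assumes the .nemo file is a tar archive and that tokenizer artifacts have names ending in `tokenizer.model`. Both are assumptions, and the `extract_tokenizer_files` helper is hypothetical rather than part of NeMo.

```python
import os
import tarfile


def extract_tokenizer_files(nemo_path, out_dir, suffixes=("tokenizer.model",)):
    """Copy archive members whose file name ends with one of `suffixes` into out_dir.

    Assumes `nemo_path` is a tar archive (an assumption about the .nemo format).
    Returns the list of extracted file paths.
    """
    extracted = []
    os.makedirs(out_dir, exist_ok=True)
    with tarfile.open(nemo_path, "r:*") as archive:
        for member in archive.getmembers():
            # Use only the base name so files land directly in out_dir.
            name = os.path.basename(member.name)
            if member.isfile() and name.endswith(suffixes):
                target = os.path.join(out_dir, name)
                source = archive.extractfile(member)
                with source, open(target, "wb") as out:
                    out.write(source.read())
                extracted.append(target)
    return extracted
```

With something along these lines, the converter could look inside the checkpoint first and fall back to the default HuggingFace tokenizer only when nothing is found, which would avoid the extra command-line argument.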

tdene avatar Apr 24 '24 11:04 tdene

This PR is stale because it has been open for 14 days with no activity. Remove stale label or comment or update or this will be closed in 7 days.

github-actions[bot] avatar May 09 '24 01:05 github-actions[bot]

This PR was closed because it has been inactive for 7 days since being marked as stale.

github-actions[bot] avatar May 16 '24 01:05 github-actions[bot]