daniel, chen

Results: 8 comments by daniel, chen

I will work on it.

When I use the convert shell script from your commit, it shows the error "Repo id must be in the form 'repo_name' or 'namespace/repo_name': '/workspace/models/Llama-2-7b-hf/tokenizer.model'. Use `repo_type` argument if needed." Do you...

I also found that using "TOKENIZER_MODEL=meta-llama/Llama-2-7b-hf" in the shell script converts HF to Megatron successfully.
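
The error message describes the shape it expects: 'repo_name' or 'namespace/repo_name'. As a rough illustration of why the local tokenizer path is rejected while "meta-llama/Llama-2-7b-hf" passes, here is a minimal sketch. The helper `looks_like_repo_id` is hypothetical and only approximates that shape; it is not a huggingface_hub function, and the real validation is stricter.

```python
def looks_like_repo_id(value: str) -> bool:
    """Hypothetical helper: rough check for the 'repo_name' or
    'namespace/repo_name' shape named in the error message.
    This is an approximation, not the library's actual check."""
    # Empty strings and values with a leading or trailing slash
    # (such as absolute paths) do not fit the shape.
    if not value or value.startswith("/") or value.endswith("/"):
        return False
    # At most one "/" may separate the namespace from the repo name.
    return value.count("/") <= 1


if __name__ == "__main__":
    candidates = (
        "meta-llama/Llama-2-7b-hf",
        "/workspace/models/Llama-2-7b-hf/tokenizer.model",
    )
    for candidate in candidates:
        print(candidate, looks_like_repo_id(candidate))
```

Under this assumption, the absolute path fails because it starts with a slash and contains several of them, which matches the message quoted above.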

When I reproduce this issue and use "--use-legacy-models" in the script, I get an error: "File "/workspace/Megatron-LM/megatron/training/arguments.py", line 576, in validate_args raise RuntimeError('--use-dist-ckpt is not supported in legacy models.') RuntimeError: --use-dist-ckpt is not...
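
The traceback points to an argument check that refuses the two options together. The sketch below is a simplified stand-in written with argparse, not Megatron-LM's actual code; only the error message is taken from the traceback, and the function bodies are assumptions for illustration.

```python
import argparse


def parse_args(argv):
    """Parse only the two flags involved in the reported conflict."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--use-legacy-models", action="store_true")
    parser.add_argument("--use-dist-ckpt", action="store_true")
    return parser.parse_args(argv)


def validate_args(args):
    """Assumed shape of the check: reject the two flags in combination."""
    if args.use_legacy_models and args.use_dist_ckpt:
        raise RuntimeError("--use-dist-ckpt is not supported in legacy models.")
    return args


if __name__ == "__main__":
    # Either flag alone passes this simplified check.
    validate_args(parse_args(["--use-legacy-models"]))
    try:
        validate_args(parse_args(["--use-legacy-models", "--use-dist-ckpt"]))
    except RuntimeError as err:
        print("RuntimeError:", err)
```

If the real check has this shape, the script would need to avoid enabling the distributed checkpoint option whenever legacy models are requested.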

OK, thank you for the reply. I will test it next Monday.

I think the article at https://www.deepspeed.ai/tutorials/megatron/ is useful. DeepSpeed ZeRO 1/2 works with the latest Megatron-LM code.
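
For reference, ZeRO stage 1 or 2 is selected in DeepSpeed's JSON config through the `zero_optimization.stage` field. The following is a minimal sketch of building such a config in Python. The helper `make_zero_config` is hypothetical, and the batch size and fp16 values are placeholders, not settings taken from the tutorial.

```python
import json


def make_zero_config(stage: int = 1) -> dict:
    """Hypothetical helper: build a small DeepSpeed-style config dict
    that selects ZeRO stage 1 or 2. Values are placeholders."""
    if stage not in (1, 2):
        raise ValueError("this sketch only covers ZeRO stages 1 and 2")
    return {
        "train_micro_batch_size_per_gpu": 4,  # placeholder value
        "gradient_accumulation_steps": 1,     # placeholder value
        "zero_optimization": {"stage": stage},
        "fp16": {"enabled": True},            # assumes mixed-precision training
    }


if __name__ == "__main__":
    # Print the config as JSON, the format DeepSpeed reads from a file.
    print(json.dumps(make_zero_config(stage=1), indent=2))
```

How this config is passed to a Megatron-LM training script depends on the integration being used, so that step is left out here.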

I am also looking for such an example.