NeMo
Regarding ready-to-use .nemo models for PEFT fine-tuning
I am planning to fine-tune the Llama model with the PEFT technique, following this official documentation: https://docs.nvidia.com/nemo-framework/user-guide/latest/playbooks/llama2peft.html However, I am facing some issues when converting the Hugging Face model to .nemo format.
So is there a repository of ready-to-use .nemo models (llama.nemo, mistral.nemo, openchat.nemo, etc.) that are compatible with the PEFT training steps mentioned in this official documentation?
If so, I could skip the Hugging Face-to-.nemo conversion step and move on to the remaining fine-tuning steps. @okuchaiev
I'm having the same problem for mistral 7B PEFT when running this command:
python3 /opt/NeMo/scripts/checkpoint_converters/convert_mistral_7b_hf_to_nemo.py --input_name_or_path=/workspace/mistral-7B-hf --output_path=mistral.nemo
This is the error:
[NeMo I 2024-04-02 17:52:08 convert_mistral_7b_hf_to_nemo:149] loading checkpoint 1: /workspace/mistral-7B-hf
in_dir: /workspace/mistral-7B-hf
Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/opt/NeMo/scripts/checkpoint_converters/convert_mistral_7b_hf_to_nemo.py", line 339, in <module>
    convert(args)
  File "/opt/NeMo/scripts/checkpoint_converters/convert_mistral_7b_hf_to_nemo.py", line 151, in convert
    model_args, ckpt, tokenizer = load_mistral_ckpt(args.input_name_or_path)
  File "/opt/NeMo/scripts/checkpoint_converters/convert_mistral_7b_hf_to_nemo.py", line 140, in load_mistral_ckpt
    model = AutoModelForCausalLM.from_pretrained(in_dir)
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/auto_factory.py", line 563, in from_pretrained
    return model_class.from_pretrained(
  File "/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py", line 3671, in from_pretrained
    ) = cls._load_pretrained_model(
  File "/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py", line 4078, in _load_pretrained_model
    state_dict = load_state_dict(shard_file, is_quantized=is_quantized)
  File "/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py", line 507, in load_state_dict
    with safe_open(checkpoint_file, framework="pt") as f:
FileNotFoundError: No such file or directory: "/workspace/mistral-7B-hf/model-00001-of-00002.safetensors"
I also tried fine-tuning Llama 2 7B, but it gave me the same error: I cannot convert the checkpoints to .nemo format because the checkpoints cannot be loaded.
@frankh077
Were you able to run the docker command
docker run --gpus device=1 --shm-size=2g --net=host --ulimit memlock=-1 --rm -it -v ${PWD}:/workspace -w /workspace -v ${PWD}/results:/results nvcr.io/ea-bignlp/ga-participants/nemofw-training:23.08.03 bash
specified in the official documentation? - https://docs.nvidia.com/nemo-framework/user-guide/latest/playbooks/llama2peft.html#step-2-convert-to-nemo
@pradeepdev-1995
Yes, but through the nemo-framework-training container. This is the command I used:
docker run --gpus device=1 --shm-size=2g --net=host --ulimit memlock=-1 --rm -it -v ${PWD}:/workspace -w /workspace -v ${PWD}/results:/results nvcr.io/nvaie/nemo-framework-training:23.08.03 bash
Is this container setup mandatory? @frankh077 Can we do the fine-tuning directly in a Python console without using the container?
I think it is mandatory, since the environment and the necessary tools are in the container. But if you can build the environment yourself, it should work; you can base it on the NeMo containers available through NGC.
This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.
This issue was closed because it has been inactive for 7 days since being marked as stale.