NanoCode012
Oh, I wasn’t aware of that model. I thought they were referencing models such as https://huggingface.co/deepseek-ai/deepseek-coder-6.7b-instruct/tree/main which is llama-based.
@ajinkya123-robo, in the meantime, you can just use AutoModelForCausalLM and AutoTokenizer with an existing config and point it to your model (?). Unfortunately, in this case, sample packing isn't available...
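To illustrate, a rough sketch of the config fields I mean (the `base_model` value is only a placeholder from the link above, so swap in your own model; `trust_remote_code` is my assumption and only needed if the repo ships custom modeling code):

```yaml
base_model: deepseek-ai/deepseek-coder-6.7b-instruct  # placeholder, point this at your model
model_type: AutoModelForCausalLM
tokenizer_type: AutoTokenizer
trust_remote_code: true  # assumption; only if the model repo ships custom code
```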
Seems like they provided a patch for llama in their repo. Parts I've noticed:

- during merge: need to load this

```py
trainable_params = os.path.join(args.peft_model, "trainable_params.bin")
if os.path.isfile(trainable_params):
    model.load_state_dict(torch.load(trainable_params, map_location=model.device), ...)
```
Can you share your config?
Just one caveat to this: an older issue wanted the HF models and datasets to be downloaded to the volume. If you change the above, the user should override these values: https://github.com/OpenAccess-AI-Collective/axolotl/blob/89134f2143cd3325802813eb97cd05c783932201/docker/Dockerfile-cloud#L4-L7
Just another insight:

```yaml
sample_packing: true
```

Try enabling this to see the throughput improvement.
Closing, as the loss issue is solved.
zero3 used to have offload, but it was removed. If you want to make a PR, you could re-add the old revision for zero3, but with a slightly different name like...
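For reference, a rough sketch of what the offload portion of such a ZeRO-3 DeepSpeed config might look like (illustrative values only, not the removed revision verbatim; JSON doesn't allow comments, so all caveats live here):

```json
{
  "zero_optimization": {
    "stage": 3,
    "offload_optimizer": {
      "device": "cpu",
      "pin_memory": true
    },
    "offload_param": {
      "device": "cpu",
      "pin_memory": true
    },
    "overlap_comm": true,
    "contiguous_gradients": true
  },
  "bf16": {
    "enabled": "auto"
  },
  "train_micro_batch_size_per_gpu": "auto"
}
```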
Added PR #1466 for this.
May I ask if Docker works? Alternatively, could you try creating a new conda/pip venv again?