NanoCode012

163 comments by NanoCode012

Oh, I wasn’t aware of that model. I thought they were referencing models such as https://huggingface.co/deepseek-ai/deepseek-coder-6.7b-instruct/tree/main, which is Llama-based.

@ajinkya123-robo, in the meantime, you can just use AutoModelForCausalLM and AutoTokenizer with an existing config and point it to your model (?). Unfortunately, in this case, sample packing isn't available...
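
For reference, outside axolotl the direct equivalent is plain Transformers loading; a minimal sketch (the model path and dtype here are placeholders, not from this thread):

```py
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "your-org/your-model"  # hypothetical; point this at your checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.bfloat16,  # assumption: bf16-capable hardware
    device_map="auto",
)
```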

Seems like they provided a patch for llama in their repo. One part I've noticed: during merge, you need to load this:

```py
trainable_params = os.path.join(args.peft_model, "trainable_params.bin")
if os.path.isfile(trainable_params):
    model.load_state_dict(torch.load(trainable_params, map_location=model.device), ...)
```
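
To make that runnable end to end, here's a hedged sketch of a full merge flow around that snippet (the paths are placeholders, and `strict=False` is my assumption for the argument cut off above, since the file would hold only a subset of the weights):

```py
import os

import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("path/to/base-model")  # hypothetical
model = PeftModel.from_pretrained(base, "path/to/peft-model")      # hypothetical

# Restore the extra trainable (non-LoRA) weights the patch saves alongside the adapter.
trainable_params = os.path.join("path/to/peft-model", "trainable_params.bin")
if os.path.isfile(trainable_params):
    model.load_state_dict(
        torch.load(trainable_params, map_location=model.device),
        strict=False,  # assumption: the file contains only a subset of weights
    )

merged = model.merge_and_unload()         # fold the LoRA weights into the base model
merged.save_pretrained("path/to/merged")  # hypothetical output path
```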

Can you share your config?

Just one caveat to this: an older issue asked for the HF models and datasets to be downloaded to the volume. If you change the above, the user should override these values: https://github.com/OpenAccess-AI-Collective/axolotl/blob/89134f2143cd3325802813eb97cd05c783932201/docker/Dockerfile-cloud#L4-L7
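
For illustration, these are the standard HF cache variables one would point back at the volume; the exact names and values in that Dockerfile may differ, and in practice you'd likely pass them with `docker run -e` rather than from Python:

```py
import os

# Hypothetical volume paths; set these before importing transformers/datasets.
os.environ["HF_HOME"] = "/workspace/data/huggingface"
os.environ["HF_DATASETS_CACHE"] = "/workspace/data/huggingface/datasets"
os.environ["HUGGINGFACE_HUB_CACHE"] = "/workspace/data/huggingface/hub"
```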

Just another insight:

```yaml
sample_packing: true
```

Try enabling this to see the improvement; sample packing concatenates multiple short examples into each training sequence, so less compute is wasted on padding.
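
For intuition, a toy sketch of what packing does (not axolotl's actual implementation, which also keeps attention masks separate across packed samples):

```py
def pack(token_seqs, max_len, eos_id):
    """Greedily concatenate tokenized samples into sequences of at most max_len."""
    packed, current = [], []
    for seq in token_seqs:
        if current and len(current) + len(seq) + 1 > max_len:
            packed.append(current)
            current = []
        current.extend(seq + [eos_id])  # separate samples with an EOS token
    if current:
        packed.append(current)
    return packed

# Three short samples become two sequences instead of three padded ones.
print(pack([[1, 2], [3, 4, 5], [6]], max_len=8, eos_id=0))
# -> [[1, 2, 0, 3, 4, 5, 0], [6, 0]]
```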

zero3 used to have an offload config, but it was removed. If you want to make a PR, you could re-add the old revision of the zero3 config under a slightly different name like...
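
For reference, a minimal sketch of the offload fields such a config would restore, written as a Python dict dumped to JSON (these are standard DeepSpeed ZeRO-3 options; the filename is hypothetical, not a suggested name for the PR):

```py
import json

# Illustrative values; tune the offload targets to your hardware.
zero3_offload = {
    "zero_optimization": {
        "stage": 3,
        "offload_optimizer": {"device": "cpu", "pin_memory": True},
        "offload_param": {"device": "cpu", "pin_memory": True},
        "overlap_comm": True,
        "contiguous_gradients": True,
    },
    "bf16": {"enabled": "auto"},
    "train_micro_batch_size_per_gpu": "auto",
}

with open("zero3_offload.json", "w") as f:  # hypothetical filename
    json.dump(zero3_offload, f, indent=2)
```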

May I ask if docker works for you? Alternatively, could you try creating a new conda/pip venv again?