NanoCode012

163 comments by NanoCode012

Oh, I wasn’t aware of that model. I thought they were referencing models such as https://huggingface.co/deepseek-ai/deepseek-coder-6.7b-instruct/tree/main, which is Llama-based.

@ajinkya123-robo, in the meantime, you can just use AutoModelForCausalLM and AutoTokenizer with an existing config and point it to your model (?). Unfortunately, in this case, sample packing isn't available...
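
For reference, outside axolotl the direct equivalent is plain Transformers loading; a minimal sketch (the model path and dtype here are placeholders, not from this thread):

```py
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "your-org/your-model"  # hypothetical; point this at your checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.bfloat16,  # assumption: bf16-capable hardware
    device_map="auto",
)
```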

Seems like they provided a patch for llama in their repo. One part I've noticed: during merge, you need to load this:

```py
trainable_params = os.path.join(args.peft_model, "trainable_params.bin")
if os.path.isfile(trainable_params):
    model.load_state_dict(torch.load(trainable_params, map_location=model.device), ...)
```
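
To make that runnable end to end, here's a hedged sketch of a full merge flow around that snippet (the paths are placeholders, and `strict=False` is my assumption for the argument cut off above, since the file would hold only a subset of the weights):

```py
import os

import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("path/to/base-model")  # hypothetical
model = PeftModel.from_pretrained(base, "path/to/peft-model")      # hypothetical

# Restore the extra trainable (non-LoRA) weights the patch saves alongside the adapter.
trainable_params = os.path.join("path/to/peft-model", "trainable_params.bin")
if os.path.isfile(trainable_params):
    model.load_state_dict(
        torch.load(trainable_params, map_location=model.device),
        strict=False,  # assumption: the file contains only a subset of weights
    )

merged = model.merge_and_unload()         # fold the LoRA weights into the base model
merged.save_pretrained("path/to/merged")  # hypothetical output path
```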

Can you share your config?

Just one caveat to this: an older issue asked for the HF models and datasets to be downloaded to the volume. If you change the above, the user should override these values: https://github.com/OpenAccess-AI-Collective/axolotl/blob/89134f2143cd3325802813eb97cd05c783932201/docker/Dockerfile-cloud#L4-L7
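
For illustration, these are the standard HF cache variables one would point back at the volume; the exact names and values in that Dockerfile may differ, and in practice you'd likely pass them with `docker run -e` rather than from Python:

```py
import os

# Hypothetical volume paths; set these before importing transformers/datasets.
os.environ["HF_HOME"] = "/workspace/data/huggingface"
os.environ["HF_DATASETS_CACHE"] = "/workspace/data/huggingface/datasets"
os.environ["HUGGINGFACE_HUB_CACHE"] = "/workspace/data/huggingface/hub"
```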

Just another insight:

```yaml
sample_packing: true
```

Try enabling this to see the improvement; sample packing concatenates multiple short examples into each training sequence, so less compute is wasted on padding.
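
For intuition, a toy sketch of what packing does (not axolotl's actual implementation, which also keeps attention masks separate across packed samples):

```py
def pack(token_seqs, max_len, eos_id):
    """Greedily concatenate tokenized samples into sequences of at most max_len."""
    packed, current = [], []
    for seq in token_seqs:
        if current and len(current) + len(seq) + 1 > max_len:
            packed.append(current)
            current = []
        current.extend(seq + [eos_id])  # separate samples with an EOS token
    if current:
        packed.append(current)
    return packed

# Three short samples become two sequences instead of three padded ones.
print(pack([[1, 2], [3, 4, 5], [6]], max_len=8, eos_id=0))
# -> [[1, 2, 0, 3, 4, 5, 0], [6, 0]]
```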

zero3 used to have an offload config, but it was removed. If you want to make a PR, you could re-add the old revision of the zero3 config under a slightly different name like...
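
For reference, a minimal sketch of the offload fields such a config would restore, written as a Python dict dumped to JSON (these are standard DeepSpeed ZeRO-3 options; the filename is hypothetical, not a suggested name for the PR):

```py
import json

# Illustrative values; tune the offload targets to your hardware.
zero3_offload = {
    "zero_optimization": {
        "stage": 3,
        "offload_optimizer": {"device": "cpu", "pin_memory": True},
        "offload_param": {"device": "cpu", "pin_memory": True},
        "overlap_comm": True,
        "contiguous_gradients": True,
    },
    "bf16": {"enabled": "auto"},
    "train_micro_batch_size_per_gpu": "auto",
}

with open("zero3_offload.json", "w") as f:  # hypothetical filename
    json.dump(zero3_offload, f, indent=2)
```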

May I ask if docker works for you? Alternatively, could you try creating a new conda/pip venv again?