llm-foundry
LLM training code for Databricks foundation models
Hello Team, can you please guide me on how to fine-tune on local datasets? The instructions given in scripts/train are not very clear. The YAML file below was given as a sample example:...
I see that `train.py` under `scripts/train` builds a model when given a model configuration. I took a look at the `7b_dolly_sft.yaml` YAML, and do you think I could...
Hi, is there any option to convert an HF checkpoint to Composer format and fine-tune with the llm-foundry scripts? Thanks!
Hey, I wanted to use the HF transformers library for the currently fastest inference possible (Triton). I am trying to use https://github.com/mosaicml/llm-foundry/blob/main/scripts/inference/hf_generate.py ```sh python test.py --temperature 1.0 \ --name_or_path...
Hello, the model takes very long to load for some reason. The actual shard loading is very fast, but the delay before it is several minutes on a 5950X and 3090...
When running the training section of the `README`, I get an error regarding `cuda.h`. Is it possible to specify a path for `composer` to look for CUDA support?...
• Storywriter is not commercially viable
• 65k, not 64k, context window for Storywriter
@vchiley @samhavens @alextrott16, I was going through the MPT-7B model fine-tuning documentation. It is definitely well written, but quite hard to grasp at first glance. Therefore, I...
The hf_chat.py program emits this warning message before each chat response: "The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior."...
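For context on the warning above: transformers emits it when `generate()` is called without an explicit `attention_mask` or `pad_token_id`. A minimal sketch of the usual fix, using a tiny randomly initialized GPT-2 stand-in rather than hf_chat.py's actual model (the config sizes and token ids here are illustrative, not from llm-foundry):

```python
import torch
from transformers import GPT2Config, GPT2LMHeadModel

# Tiny random model as a stand-in, so this runs offline; hf_chat.py would
# load your MPT checkpoint instead.
config = GPT2Config(n_layer=2, n_head=2, n_embd=32, vocab_size=100)
model = GPT2LMHeadModel(config).eval()

input_ids = torch.tensor([[1, 2, 3, 4]])           # illustrative token ids
attention_mask = torch.ones_like(input_ids)        # 1 = attend to this position

out = model.generate(
    input_ids=input_ids,
    attention_mask=attention_mask,  # passing the mask silences half the warning
    pad_token_id=0,                 # explicit pad id silences the other half
    max_new_tokens=5,
)
print(out.shape)  # (1, 9): 4 prompt tokens + 5 generated tokens
```

With a real tokenizer, `tokenizer(text, return_tensors="pt")` returns the `attention_mask` for you, and GPT-style tokenizers that lack a pad token commonly reuse `tokenizer.eos_token_id` as the `pad_token_id`.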
Does it work on my local machine, or is a GPU necessary to run this model? I tried to load a model on my local machine with...