llm-foundry
LLM training code for Databricks foundation models
I have trained a 125M MPT on a small dataset and generated text via `inference/hf_generate.py` (before this I converted the checkpoint from Composer to HF format), and it gives me some value from `eval/eval.py` with...
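For context, a minimal sketch of the convert-then-generate flow described above. The checkpoint paths and flag names are assumptions based on `scripts/inference/convert_composer_to_hf.py` and `scripts/inference/hf_generate.py`; check each script's `--help` before running.

```bash
# Sketch only: convert a Composer checkpoint to a HF folder, then generate from it.
# Paths are hypothetical; flag names are assumed from the scripts' help text.
python scripts/inference/convert_composer_to_hf.py \
  --composer_path ./mpt-125m/checkpoints/latest-rank0.pt \
  --hf_output_path ./mpt-125m-hf \
  --output_precision bf16

# Generate from the converted HF checkpoint (--prompts / --max_new_tokens assumed).
python scripts/inference/hf_generate.py \
  --name_or_path ./mpt-125m-hf \
  --prompts "My test prompt:" \
  --max_new_tokens 64
```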
I am trying to run inference with the mosaicml/mpt-7b model on Colab but got unexpected results:
```
!python /content/llm-foundry/scripts/inference/hf_generate.py \
  --name_or_path 'mosaicml/mpt-7b' \
  --temperature 1.0 \
  --top_p 0.95 \
  --top_k 50 \
  --seed...
```
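When sampling output looks off, a lower-temperature run with an explicit prompt is a common sanity check. This is only a sketch; flags such as `--prompts` and `--max_new_tokens` are assumptions to verify against `hf_generate.py --help`.

```bash
# Sketch: same script, explicit prompt and less aggressive sampling.
!python /content/llm-foundry/scripts/inference/hf_generate.py \
  --name_or_path 'mosaicml/mpt-7b' \
  --prompts 'Here is a quick recipe for baking bread:' \
  --temperature 0.3 \
  --top_p 0.95 \
  --top_k 50 \
  --max_new_tokens 128 \
  --seed 42
```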
I uncommented the following lines in https://github.com/mosaicml/llm-foundry/blob/main/scripts/train/finetune_example/gpt2-arc-easy.yaml: `save_num_checkpoints_to_keep: 1 # Important, this cleans up checkpoints saved to DISK` and `save_folder: ./{run_name}/checkpoints`. Then I ran `composer train.py finetune_example/gpt2-arc-easy.yaml`. I can see the saved...
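As a side note, the same keys can be supplied without editing the YAML, assuming `train.py` merges extra `key=value` arguments into the config (OmegaConf-style CLI overrides, as in the current script). The resume path below is hypothetical.

```bash
# Sketch: pass the checkpointing keys as CLI overrides instead of uncommenting them.
composer train.py finetune_example/gpt2-arc-easy.yaml \
  save_num_checkpoints_to_keep=1 \
  save_folder='./{run_name}/checkpoints'

# Sketch: resume from one of the saved Composer checkpoints (path is hypothetical).
composer train.py finetune_example/gpt2-arc-easy.yaml \
  load_path='./gpt2-arc-easy/checkpoints/latest-rank0.pt'
```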
Hi, I am trying to run the test suite to see if my setup is correct, and I am down to 31 failed, 4852 passed, etc. However, the ones that...
I want to ask about training with a Slurm command. I'm training a 7B-parameter model, but apparently when I set up the environment with more than one node, it only sees one...
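For the multi-node case, here is a sketch of a Slurm batch script that runs the Composer launcher once per node. The launcher flags (`--world_size`, `--node_rank`, `--master_addr`, `--master_port`, `--nproc`) and the YAML path are assumptions to verify against `composer --help` and your checkout.

```bash
#!/bin/bash
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=1      # one launcher per node; composer spawns one process per GPU
#SBATCH --gpus-per-node=8

# Sketch: derive rendezvous info from Slurm and hand it to the composer launcher.
MASTER_ADDR=$(scontrol show hostnames "$SLURM_JOB_NODELIST" | head -n 1)
MASTER_PORT=29500
GPUS_PER_NODE=8
WORLD_SIZE=$(( SLURM_JOB_NUM_NODES * GPUS_PER_NODE ))

srun --ntasks-per-node=1 bash -c \
  "composer --world_size $WORLD_SIZE --nproc $GPUS_PER_NODE \
     --node_rank \$SLURM_NODEID \
     --master_addr $MASTER_ADDR --master_port $MASTER_PORT \
     train.py yamls/pretrain/mpt-7b.yaml"
```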
Changes almost identical to those in https://github.com/mosaicml/examples/pull/335

Not an issue, just a question. Does llm-foundry automatically handle EOS tokens, or should we manually add them to our text data to denote document boundaries? For example, if we are loading...
Hi! You have a script for preparing data in `scripts/train`, which is `python ../data_prep/convert_dataset_hf.py --dataset c4 --data_subset en --out_root ./my-copy-c4 --splits train_small val_small --concat_tokens 2048 --tokenizer EleutherAI/gpt-neox-20b --eos_text ''`. Can...
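On the EOS question above: with `--concat_tokens`, documents are packed back-to-back, so any separator comes from `--eos_text`. A sketch of the same command with a non-empty EOS string for the gpt-neox-20b tokenizer follows; the exact string is an assumption to verify against your tokenizer.

```bash
# Sketch: append the tokenizer's EOS string between concatenated documents
# instead of the empty string used above.
python ../data_prep/convert_dataset_hf.py \
  --dataset c4 --data_subset en \
  --out_root ./my-copy-c4 \
  --splits train_small val_small \
  --concat_tokens 2048 \
  --tokenizer EleutherAI/gpt-neox-20b \
  --eos_text '<|endoftext|>'
```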