llm-foundry
                        LLM training code for Databricks foundation models
There is a small bug in the repo's RMSNorm implementation: https://github.com/mosaicml/llm-foundry/blob/7abae0d079e8c99a8f1eb73861b65e6766e45018/llmfoundry/models/layers/norm.py#L54-L58 Since `torch.rsqrt` (reciprocal sqrt) is used instead of `torch.sqrt`, `x` should be _multiplied_ by the result instead of...
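A minimal sketch of the fix described above, assuming a simplified functional signature rather than the repo's exact `RMSNorm` module: because `torch.rsqrt(v)` computes `1/sqrt(v)`, the normalized value is `x * rsqrt(...)`, not `x / rsqrt(...)`.

```python
import torch

def rms_norm(x: torch.Tensor, weight: torch.Tensor, eps: float = 1e-5) -> torch.Tensor:
    # torch.rsqrt returns the reciprocal square root, so we MULTIPLY by it.
    # Dividing by rsqrt (the reported bug) would un-normalize x instead.
    normed = x * torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + eps)
    return normed * weight

# Sanity check: multiplying by rsqrt matches dividing by sqrt.
x = torch.randn(2, 8)
w = torch.ones(8)
reference = x / torch.sqrt(x.pow(2).mean(dim=-1, keepdim=True) + 1e-5)
assert torch.allclose(rms_norm(x, w), reference, atol=1e-6)
```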
A100+40G 
Finetuning with LoRA issue "TypeError: forward() got an unexpected keyword argument 'inputs_embeds'"
Hi! I am trying to finetune MPT-7B with LoRA configurations.
```python
# Model
model_name = "mosaicml/mpt-7b"
config = transformers.AutoConfig.from_pretrained(
    model_name,
    trust_remote_code=True
)
config.update({"max_seq_len": 4096})
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    trust_remote_code=True,
    torch_dtype=bfloat16,
    device_map='auto',
    ...
```
Running ``` python eval/eval.py eval/yamls/hf_eval.yaml icl_tasks=eval/yamls/winograd.yaml model_name_or_path=gpt2 ``` fails with ``` RuntimeError: CUDA error: CUBLAS_STATUS_NOT_INITIALIZED when calling `cublasCreate(handle)` ``` The same command runs fine with other models (e.g., `EleutherAI/gpt-neo-125M`). Any...
## 🚀 Feature Request Support for custom tokenizers that require `trust_remote_code=True` (e.g., the Replit tokenizer). ## Motivation Currently, when using a tokenizer that requires `trust_remote_code=True`, the scripts `train.py` and `convert_composer_to_hf.py`...
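A hypothetical sketch of what such support might look like in a training YAML, assuming a `kwargs` passthrough to the tokenizer constructor (this key layout is illustrative, not a currently supported option per the issue):

```yaml
# Hypothetical config fragment: forward trust_remote_code to the tokenizer.
tokenizer:
  name: replit/replit-code-v1-3b
  kwargs:
    trust_remote_code: true
```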
In your training config, we can choose `data_local` or `data_remote`. If I'm using `data_remote` pointing at S3, what behavior do I get? Does the training loop stream directly from remote S3, or does it first transfer...
I have tried to implement MPT-7B-chat on the MosaicML platform. I have executed the first step to convert the C4 dataset to streaming format and stored my shard files on...
The README confused two interns, and chasing the instructions at every turn was killing me.
Can you please provide an inference script for the ONNX model as well? I can't figure out a way to run the model. After I converted the model to...
Adding `print_example_icl_example: True` to `hf_eval.yaml` outputs:
```
Example:
----------
Question: How do you prepare fresh chicken eggs?
Answer: Use warm water, not cold water. Warm water can cause the contents of...
```