llm-foundry
LLM training code for Databricks foundation models
Bumps databricks-connect from 14.1.0 to 15.4.2. Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a...
When using llm-foundry for model evaluation, multi-GPU mode does not work. The source code is here: https://github.com/mlfoundations/open_lm/blob/main/eval/eval_openlm_ckpt.py
## Environment python 3.11.9 cuda 11.8 torch 2.4.0+cu118 PyTorch information ------------------- PyTorch version: 2.4.0+cu118 Is debug build: False CUDA used to build PyTorch: 11.8 ROCM used to build PyTorch: N/A...
When I test the public DCLM-7B model (https://huggingface.co/apple/DCLM-7B) on the TriviaQA small subset, the metric is very low. Eval metrics/triviaqa_sm_sub/0-shot/InContextLearningGenerationExactMatchAccuracy: 0.0003 YAML config: label: triviaqa_sm_sub dataset_uri: eval/local_data/world_knowledge/triviaqa_sm_sub.jsonl num_fewshot: [0, 3, 5] icl_task_type: generation_task_with_answers...
Bumps [fsspec](https://github.com/fsspec/filesystem_spec) from 2023.6.0 to 2024.9.0. Commits 76ca4a6 Changelog (#1670) 7793ab8 Improve performance find zip archive (#1664) ee98ae3 Apply more ruff rules (#1660) 4f883ad Remove print statement (#1661) 4b79654 localfs:...
Bumps [tiktoken](https://github.com/openai/tiktoken) from 0.4.0 to 0.7.0. Changelog Sourced from tiktoken's changelog. [v0.7.0] Support for gpt-4o Performance improvements [v0.6.0] Optimise regular expressions for a 20% performance improvement, thanks to @paplorinc! Add...
Catches tokenization failures on custom HF datasets with missing/extra/mistyped columns.
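The kind of check described above can be sketched as follows. This is a minimal illustration, not llm-foundry's actual implementation: the `validate_example` helper and the `EXPECTED_COLUMNS` schema are hypothetical names, and the `prompt`/`response` columns are an assumed layout for a custom dataset.

```python
from typing import Any, Mapping

# Hypothetical schema (an assumption for illustration): column name -> expected type.
EXPECTED_COLUMNS: dict[str, type] = {"prompt": str, "response": str}


def validate_example(
    example: Mapping[str, Any],
    expected: Mapping[str, type] = EXPECTED_COLUMNS,
) -> list[str]:
    """Return a list of problems found in one dataset row; empty if the row is well-formed."""
    problems: list[str] = []

    # Columns the schema requires but the row lacks.
    missing = sorted(set(expected) - set(example))
    if missing:
        problems.append(f"missing columns: {missing}")

    # Columns the row has but the schema does not know about.
    extra = sorted(set(example) - set(expected))
    if extra:
        problems.append(f"unexpected columns: {extra}")

    # Columns that are present but hold the wrong type.
    for name, expected_type in expected.items():
        if name in example and not isinstance(example[name], expected_type):
            actual = type(example[name]).__name__
            problems.append(
                f"column {name!r} has type {actual}, expected {expected_type.__name__}"
            )
    return problems
```

Running a check like this on each row before tokenization would let a malformed row be reported with a specific message instead of failing later inside the tokenizer.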
## What changes are proposed in this pull request? See go/sweeps-eval for more context. Corresponding MAPI changes are contained in https://github.com/databricks-mosaic/mcloud/pull/4562. ## How is this tested? Added unit tests.
... on bare metal (inside conda env). But I am getting the following error: ``` Building CMake extension transformer_engine Running command /opt/conda/lib/python3.11/site-packages/cmake/data/bin/cmake -S /tmp/pip-req-build-eibanc7t/transformer_engine/common...
When I fine-tune MPT, the code works. But when I fine-tune Llama, I get the following error. ``` ----------Begin global rank 2 STDERR---------- 2024-09-02 20:12:15,331: rank2[3924][MainThread]: DEBUG: llmfoundry.command_utils.train:...