llm-foundry

LLM training code for Databricks foundation models

267 llm-foundry issues

When attempting to load a sharded checkpoint, we (@prigoyal and I) hit the following error: ``` /usr/lib/python3/dist-packages/composer/utils/checkpoint.py:287 in load_checkpoint ... 284...

bug

Previously, large files were read entirely at once via `file.read()`. This change reads the file and tokenizes it in chunks.
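
A minimal sketch of what chunked reading plus tokenization can look like, assuming a Hugging Face tokenizer; the chunk size and helper name are illustrative, not the actual implementation in the PR:

```python
from transformers import AutoTokenizer

def tokenize_in_chunks(path: str, tokenizer, chunk_size: int = 1 << 20):
    """Yield token ids for a large text file without loading it all into memory."""
    with open(path, "r", encoding="utf-8") as f:
        while True:
            chunk = f.read(chunk_size)  # read at most chunk_size characters
            if not chunk:
                break
            yield tokenizer(chunk, add_special_tokens=False)["input_ids"]

tokenizer = AutoTokenizer.from_pretrained("gpt2")
for ids in tokenize_in_chunks("data/shard_00.txt", tokenizer):
    pass  # write ids to the output dataset here
```

Note that splitting at a fixed character count can cut a token across a chunk boundary; a real implementation would buffer the tail of each chunk or split on line boundaries.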

Here is the code we use to test our benchmark tasks: we run a series of progressively more capable models to see whether the benchmarks effectively differentiate between them, and at...
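
A minimal sketch of that kind of separability check, with the evaluation call left as a parameter since the actual eval harness isn't shown in the snippet:

```python
from typing import Callable, Sequence

def check_benchmark_separability(
    evaluate: Callable[[str, str], float],  # (model_name, benchmark) -> score; plug in the real eval harness
    models: Sequence[str],                  # ordered weakest -> strongest
    benchmarks: Sequence[str],
) -> dict[str, bool]:
    """Return, per benchmark, whether scores increase strictly with model capability."""
    results = {}
    for benchmark in benchmarks:
        scores = [evaluate(m, benchmark) for m in models]
        results[benchmark] = all(a < b for a, b in zip(scores, scores[1:]))
        status = "differentiates" if results[benchmark] else "does not differentiate"
        print(f"{benchmark}: {scores} -> {status}")
    return results
```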

I want to pretrain an LLM on 2T tokens using llm-foundry, but the data processing before training takes too long. Is there any way to accelerate it?

enhancement
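
For the data-processing question above, one common approach is to parallelize tokenization across processes while writing the output shards. A rough sketch, assuming the `streaming` package's MDSWriter and a Hugging Face tokenizer; the paths, column layout, tokenizer, and process count are all illustrative:

```python
from multiprocessing import Pool

import numpy as np
from streaming import MDSWriter          # from the mosaicml-streaming package
from transformers import AutoTokenizer

IN_FILES = [f"raw/shard_{i:03d}.txt" for i in range(128)]   # illustrative input shards
_tok = None  # one tokenizer per worker process, created lazily

def tokenize_file(path: str) -> list[bytes]:
    global _tok
    if _tok is None:
        _tok = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
    with open(path, encoding="utf-8") as f:
        return [
            np.asarray(_tok(line)["input_ids"], dtype=np.uint32).tobytes()
            for line in f
            if line.strip()
        ]

if __name__ == "__main__":
    # Tokenize shards in parallel worker processes; write MDS shards from the parent.
    with MDSWriter(out="mds_out/", columns={"tokens": "bytes"}, compression="zstd") as writer:
        with Pool(processes=16) as pool:
            for samples in pool.imap(tokenize_file, IN_FILES):
                for tokens in samples:
                    writer.write({"tokens": tokens})
```

The same idea applies whether the raw text comes from local files or from a dataset split across workers; the writer stays single-process while tokenization scales out.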

Adding Big Bench Hard subset as a set of combined CoT tasks, formatted according to the specification in [this repo](https://github.com/suzgunmirac/BIG-Bench-Hard/tree/main). These tasks are quite large and quite slow. I don't...

I'm trying to implement DecoupledLionW_8bit in my fine-tuning script, but I get the following error: > ERROR: Could not find a version that satisfies the requirement mosaicml-turbo==0.0.2; extra == "gpu"...

question

The current path for streaming finetuning datasets does not allow streaming from a local path (which works for text datasets out of the box and is also supported by `StreamingFinetuningDataset`...

The model should not be trained to predict the token after the eos_token, because that token comes from a different sequence. This PR implements this logic. TODO: experimental verification.
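
A minimal sketch of the idea, assuming concatenated sequences and a loss that ignores labels set to -100; this illustrates the masking logic rather than the PR's actual code:

```python
import torch

IGNORE_INDEX = -100  # positions with this label are excluded from the cross-entropy loss

def mask_targets_after_eos(input_ids: torch.Tensor, eos_token_id: int) -> torch.Tensor:
    """Build labels for next-token prediction over concatenated sequences.

    labels[i] is the target for position i, i.e. input_ids[i + 1] (shifted left).
    Wherever input_ids[i] == eos_token_id, the "next token" belongs to a different
    sequence, so that position's label is masked out.
    """
    labels = input_ids.clone()
    labels[:, :-1] = input_ids[:, 1:]                  # shift left: predict the next token
    labels[:, -1] = IGNORE_INDEX                       # no target for the final position
    labels[input_ids == eos_token_id] = IGNORE_INDEX   # don't predict across sequence boundaries
    return labels

# Example: two short sequences packed together, eos_token_id = 0
ids = torch.tensor([[5, 6, 0, 7, 8, 0]])
print(mask_targets_after_eos(ids, eos_token_id=0))
# tensor([[   6,    0, -100,    8,    0, -100]])
```

With a Hugging Face causal LM, which shifts labels internally, the equivalent is to set the label of the position immediately after each eos_token to -100.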

Implement F1 score for reference-based grading of QA tasks. This PR is dependent on Max's [refactor](https://github.com/mosaicml/composer/pull/2713). Added QuAC, Natural Questions, and NarrativeQA. Tested mpt-7b-instruct: ``` | Category | Benchmark...
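
Reference-based F1 for QA is typically the SQuAD-style token-overlap score; a minimal sketch (answer normalization such as article and punctuation stripping is simplified here, and this is not the PR's actual metric code):

```python
from collections import Counter

def qa_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 between a predicted answer and a reference answer."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)
    common = Counter(pred_tokens) & Counter(ref_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)

print(qa_f1("the Eiffel Tower", "Eiffel Tower"))  # 0.8
```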

Enable Delta table as input for CPT. For CPT, you need to provide some tokenizer arguments so the resulting MDS dataset can be written: `python scripts/data_prep/convert_delta_to_json.py --delta_table_name main.streaming.random_cpt_table --processes 128...`
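
Once the Delta table has been exported to JSON, the tokenizer arguments come into play when the text is tokenized and written out as MDS. A rough sketch of that step, with illustrative paths, tokenizer, and sequence length rather than the actual script:

```python
import json

import numpy as np
from streaming import MDSWriter
from transformers import AutoTokenizer

# Illustrative settings; these correspond to the tokenizer arguments
# that would be passed to the data prep step.
TOKENIZER_NAME = "EleutherAI/gpt-neox-20b"
CONCAT_TOKENS = 2048  # sequence length for continued pretraining samples

tok = AutoTokenizer.from_pretrained(TOKENIZER_NAME)
buffer: list[int] = []

with MDSWriter(out="cpt_mds/", columns={"tokens": "bytes"}, compression="zstd") as writer:
    with open("delta_export/part-0.jsonl", encoding="utf-8") as f:
        for line in f:
            record = json.loads(line)
            buffer.extend(tok(record["text"])["input_ids"] + [tok.eos_token_id])
            # Emit fixed-length concatenated samples once enough tokens accumulate.
            while len(buffer) >= CONCAT_TOKENS:
                sample, buffer = buffer[:CONCAT_TOKENS], buffer[CONCAT_TOKENS:]
                writer.write({"tokens": np.asarray(sample, dtype=np.uint32).tobytes()})
```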