
LLM training code for Databricks foundation models

Results: 267 llm-foundry issues

When fine-tuning Llama 3.1, the following error occurs. I can't locate the exact cause; how can I fix it? Running environment: ``` Python 3.11.0rc1 GPU: 2xA100 CUDA Version:...

question

Most HF models have `use_cache` set to `True` by default, but llm-foundry manually changes it to `False` (most likely due to https://github.com/huggingface/transformers/issues/28056). This PR sets `use_cache` back to `True`...

Adds temperature tuning in attention, similar to https://github.com/huggingface/transformers/blob/9a4ce6477019358abc3ebd72d435da56f4c0ab7c/src/transformers/models/llama4/modeling_llama4.py#L332-L337
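For context, the linked Llama 4 code scales attention logits by a temperature that grows logarithmically with token position. A minimal scalar sketch of that formula, assuming the upstream hyperparameter names `floor_scale` and `attn_scale` (the real code operates on tensors of cache positions):

```python
import math

# Sketch of Llama-4-style attention temperature tuning: the scale
# grows logarithmically with token position, sharpening attention
# at long context lengths. Default values mirror common Llama 4
# settings but are assumptions here, not llm-foundry's choices.
def attn_temperature(position: int,
                     floor_scale: float = 8192.0,
                     attn_scale: float = 0.1) -> float:
    return math.log(math.floor((position + 1) / floor_scale) + 1.0) * attn_scale + 1.0
```

At position 0 the scale is exactly 1.0 (no change to the logits), and it only starts rising once positions pass `floor_scale`.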

Hello, I’m running a 7B model with a 32k context size and seeing unexpected memory-scaling behavior. Here’s the situation: - **Config**: same overall setup, only changing `global_batch_size`. - **Case...

## This PR

Adds a conversion script for pre-tokenized data in a Delta table.

## Testing

MCLI IFT and CPT runs trained successfully.

Hello! Question: in data_prep, if I use `--concat_tokens k`, it divides all my data into chunks of k tokens, but if I want to just take a sample from my data...
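Conceptually, `--concat_tokens k` concatenates tokenized documents and splits the stream into fixed k-token chunks; to use only a sample, the documents should be subsampled before this step. A minimal sketch of the chunking idea (the function name is illustrative, not llm-foundry's API):

```python
# Illustrative sketch of concat-tokens chunking: documents'
# token lists are concatenated and emitted as fixed k-token
# chunks; a trailing remainder shorter than k is dropped.
def concat_into_chunks(token_streams, k):
    buf = []
    for toks in token_streams:
        buf.extend(toks)
        while len(buf) >= k:
            yield buf[:k]
            buf = buf[k:]
```

To sample first, you would simply pass a subset of documents into a routine like this instead of the full corpus.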