llm-foundry
Add option for softcap in attention and lm_head logits
Adds an option for softcapping the attention and lm_head logits, to support Gemma-like models. The config names are the same as the Hugging Face names here: https://github.com/huggingface/transformers/blob/96a074fa7e2c04b904f72d9e827398d4c5f90f25/src/transformers/models/gemma2/modeling_gemma2.py#L371
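For context, logit softcapping in the Gemma 2 style squashes values through a scaled tanh so they stay within (-cap, cap). Below is a minimal sketch of the operation; the config names `attn_logit_softcapping` and `final_logit_softcapping` and the example cap values mirror the Hugging Face Gemma 2 config and are shown for illustration, not as the exact implementation in this PR.

```python
import torch


def soft_cap(logits: torch.Tensor, cap: float) -> torch.Tensor:
    """Tanh soft-capping: smoothly bounds values to the range (-cap, cap)."""
    return cap * torch.tanh(logits / cap)


# Illustrative usage (shapes and cap values are examples only).
attn_scores = torch.randn(2, 8, 16, 16)   # (batch, heads, q_len, kv_len)
attn_logit_softcapping = 50.0             # Gemma 2 default for attention logits
attn_scores = soft_cap(attn_scores, attn_logit_softcapping)

lm_logits = torch.randn(2, 16, 32000)     # (batch, seq_len, vocab_size)
final_logit_softcapping = 30.0            # Gemma 2 default for final lm_head logits
lm_logits = soft_cap(lm_logits, final_logit_softcapping)
```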