xgen
Salesforce open-source LLMs with 8k sequence length.
Bumps [transformers](https://github.com/huggingface/transformers) from 4.29.2 to 4.36.0. Release notes sourced from transformers's releases. v4.36: Mixtral, Llava/BakLlava, SeamlessM4T v2, AMD ROCm, widespread F.sdpa support. New model additions: Mixtral. Mixtral is the new...
A previous GH issue ([here](https://github.com/salesforce/xgen/issues/8)) mentions that a modified version of this script ([here](https://github.com/hendrycks/test/pull/13/files)) was used to collect the MMLU numbers. What about the scripts for the other benchmarks in the blog post? As...
Hi all, I've been able to get xgen-7b to work for sentence completion on a GPU, but I cannot get it to work for question/answer. The code I'm using is below: ```...
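For reference, here is a minimal sketch of a question/answer setup with the instruction-tuned checkpoint. The "### Human:" prompt header mirrors the example on the xgen model card; the question text and generation settings are placeholders, not the code from this issue:

```python
# Minimal sketch of Q&A with the instruction-tuned checkpoint, assuming
# a single CUDA GPU with bfloat16 support. The prompt header follows the
# example on the xgen model card; adjust if your checkpoint differs.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "Salesforce/xgen-7b-8k-inst", trust_remote_code=True
)
model = AutoModelForCausalLM.from_pretrained(
    "Salesforce/xgen-7b-8k-inst", torch_dtype=torch.bfloat16
).to("cuda")

header = (
    "A chat between a curious human and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the human's "
    "questions.\n\n"
)
# Placeholder question; the instruct model expects the Human/Assistant framing.
prompt = header + "### Human: What is the capital of France?\n###"

inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
output = model.generate(**inputs, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

Plain sentence completion works with the base checkpoints, but the instruct model tends to answer questions only when given this chat-style framing.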
Hi team, can you please share the fine-tuning codebase (used for Salesforce/xgen-7b-8k-inst)?
I need help. When using model = "Salesforce/xgen-7b-4k-base" or model = "Salesforce/xgen-7b-8k-base":

/usr/local/lib/python3.10/dist-packages/transformers/models/auto/tokenization_auto.py in from_pretrained(cls, pretrained_model_name_or_path, *inputs, **kwargs)
    687 tokenizer_class = tokenizer_class_from_name(tokenizer_class_candidate)
    688 if tokenizer_class is None:
--> 689     raise...
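This error usually means AutoTokenizer could not resolve the custom tokenizer class that the xgen checkpoints ship. The model card's instructions pass trust_remote_code=True so that the repo's own tokenizer code can be loaded; a minimal sketch, assuming a standard transformers install:

```python
# Minimal sketch: the xgen checkpoints use a custom (tiktoken-based)
# tokenizer, so AutoTokenizer needs permission to run the model repo's
# tokenizer code via trust_remote_code=True.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Salesforce/xgen-7b-8k-base"  # same applies to the 4k checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto")
```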
The blog post refers to "our open-sourced codebase", but I don't see any code in this repository. Am I missing something?
Hi, could you please release the training data too, to enable further research into the model's behavior? Other projects, like EleutherAI's Pythia project, have done that, which has helped...
Hey, amazing work! Can we expect a follow-up with larger models (13B, 30B, and 65B)? Also, I think combining your method with https://arxiv.org/abs/2306.15595 would be amazing to get an open source...