Open-Assistant issues

Bump typescript from 4.9.5 to 5.2.2 in /website

1

Bumps [typescript](https://github.com/Microsoft/TypeScript) from 4.9.5 to 5.2.2. Release notes Sourced from typescript's releases. TypeScript 5.2 For release notes, check out the release announcement. For the complete list of fixed issues, check...

dependabot[bot]

dependencies

Dataset release cycle

2

Is there any plan to release the dataset in cycles? I think it should have grown considerably compared to the V1 dataset.

flozi00

data

Orca data filtered loader

2

shahules786

ml

MLC version release request

Hello, as someone who has contributed to the data of Open Assistant, I would really appreciate it if I (and others) could make use of it locally on mobile. To...

GameOverFlowChart

[Bug] : Issue with Changing Plugins During Loading of Answers

**Description:** While using Open Assistant for chat-based interactions, I encountered an issue related to changing plugins during the loading of answers. This behavior seems to result in errors and extended...

shephinphilip

Tokenizer padding_side was not validated to be "right" in trainer_sft.py

1

```
from transformers import AutoTokenizer

AutoTokenizer.from_pretrained("OpenAssistant/llama2-13b-orca-8k-3319").padding_side  # 'left'
AutoTokenizer.from_pretrained("TheBloke/Llama-2-13B-fp16").padding_side  # 'left'
AutoTokenizer.from_pretrained("mosaicml/mpt-7b").padding_side  # 'right'
AutoTokenizer.from_pretrained("huggyllama/llama-7b").padding_side  # 'left'
AutoTokenizer.from_pretrained("OpenAssistant/llama-30b-sft-v8.2-2.4k-steps-system").padding_side  # 'left'
```
Since llama models are using left padding, the supervised...

theblackcat102

bug

ml
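To make the difference between the two `padding_side` values in the issue above concrete, here is a minimal sketch in plain Python. It does not use the `transformers` library; `pad_batch` and its parameters are hypothetical names introduced only for illustration, and the sketch assumes padding simply means filling shorter sequences with a pad token id up to the longest length in the batch.

```python
from typing import List


def pad_batch(sequences: List[List[int]], pad_id: int, padding_side: str = "right") -> List[List[int]]:
    """Pad every sequence to the length of the longest one (hypothetical helper)."""
    if padding_side not in ("left", "right"):
        raise ValueError(f"padding_side must be 'left' or 'right', got {padding_side!r}")
    # Longest sequence in the batch; 0 if the batch is empty.
    max_len = max((len(seq) for seq in sequences), default=0)
    padded = []
    for seq in sequences:
        pad = [pad_id] * (max_len - len(seq))
        # Right padding appends pad tokens; left padding prepends them.
        padded.append(seq + pad if padding_side == "right" else pad + seq)
    return padded


batch = [[1, 2, 3], [4]]
right_padded = pad_batch(batch, pad_id=0, padding_side="right")
left_padded = pad_batch(batch, pad_id=0, padding_side="left")
```

With right padding the real tokens keep their positions from index 0; with left padding they are shifted toward the end of the sequence, which is why code that assumes one side can misbehave when a tokenizer defaults to the other.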

For PEFT training, how to handle a changed tokenizer?

If the model's num_embeddings is 10000 but we change the tokenizer to 10007 tokens, then after SFT training the model's num_embeddings will be 10016. That is because in model/model_training/utils/utils.py, get_model(conf, tokenizer, pad_vocab_size_to_multiple_of=16, check_freeze_layer=True) has...

zhanglu0704
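The numbers in the PEFT tokenizer question above (10007 tokens becoming 10016 embeddings) are consistent with rounding the vocabulary size up to the next multiple of 16. The following is a minimal sketch of that arithmetic only; `pad_to_multiple` is a hypothetical helper, not the repository's `get_model` code, whose body is truncated in the issue text.

```python
def pad_to_multiple(vocab_size: int, multiple: int) -> int:
    """Round vocab_size up to the nearest multiple of `multiple` (hypothetical helper)."""
    if multiple <= 0:
        raise ValueError("multiple must be a positive integer")
    # Ceiling division followed by multiplication gives the next multiple.
    return ((vocab_size + multiple - 1) // multiple) * multiple


original_size = pad_to_multiple(10000, 16)  # 10000 is already a multiple of 16
resized_size = pad_to_multiple(10007, 16)   # rounds up to 10016
```

Under this assumption, adding 7 tokens to a 10000-entry vocabulary yields 16 new embedding rows rather than 7, because 10000 is already aligned and the next multiple of 16 above 10007 is 10016.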

edit doc

1

Edit sft Doc

blancsw

RuntimeError: Timed out initializing process group in store based barrier on rank 2

2

I am trying to run pretraining of LLaMA 30b. Here is the command I am running:
```
deepspeed trainer_sft.py --configs defaults llama-30b-pretrain pretrain --cache_dir $DATA_PATH --output_dir $MODEL_PATH/llama-30b-pre --deepspeed
```
And after...

SingL3

ml

Supervised Fine-tuning with Vicuna

Hi, I've seen the Vicuna model [here](https://lmsys.org/blog/2023-03-30-vicuna/). It seems like a pretty good model and supports lots of other languages (like Persian) out of the box. It speaks Farsi but not...

pourmand1376
