Teknium
Since it's fixed by reverting HF transformers, I don't think it's CUDA-related?
Any updates on this? Is everything good now? Can we fix old models by changing the tokenizer config, or is something else needed?
> > @teknium1 You need to retrain on the fixed/updated base HF models. Anything trained using old transformer code on the decapoda models are bound to break. You can hack...
> > For everyone's convenience, I've uploaded **llama models converted with the latest transformer git head** here:
> >
> > **7B** - https://huggingface.co/yahma/llama-7b-hf
> > **13B** - https://huggingface.co/yahma/llama-13b-hf
> >
> > Unfortunately, unlike the...
I tried both suggestions (bnb 0.37.2 and the latest git transformers) but still ran into the issue.
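For anyone comparing notes, this is roughly the load path those suggestions amount to — a minimal sketch, assuming the re-converted repo id from the comment above and that bitsandbytes (e.g. 0.37.2) is installed for 8-bit loading:

```python
from transformers import LlamaForCausalLM, LlamaTokenizer

# Sketch of "use the re-converted weights with current transformers git".
# The repo id comes from the comment above; load_in_8bit assumes bitsandbytes.
model_id = "yahma/llama-7b-hf"
tokenizer = LlamaTokenizer.from_pretrained(model_id)
model = LlamaForCausalLM.from_pretrained(model_id, load_in_8bit=True, device_map="auto")

# Quick sanity check: the old decapoda conversions reportedly shipped wrong
# special token ids, so printing these is an easy way to tell which tokenizer
# config you actually ended up with.
print(tokenizer.bos_token_id, tokenizer.eos_token_id, tokenizer.unk_token_id)
```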
Please add this, because I have an Alpaca model that was trained on a bad dataset, with many cases of the input and output fields having "" text in them, which...
> @teknium1 I think that `bad_words_list` as it is would be enough for your example. But if you still feel something like the `logit_bias` parameter is what you need, react...
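For context, the closest thing in plain HF transformers is the `bad_words_ids` argument to `generate()` — a rough sketch of what I'm after, where the model id and the banned string are just placeholders:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Rough sketch using HF transformers' bad_words_ids (the analogue of the
# bad_words_list mentioned above). Model id and banned string are placeholders.
model_id = "yahma/llama-7b-hf"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Token ids for strings the model should never emit, e.g. the junk text that
# leaked in from the bad dataset fields. add_special_tokens=False keeps only
# the raw tokens of the string itself.
banned = ["PLACEHOLDER_BAD_STRING"]
bad_words_ids = tokenizer(banned, add_special_tokens=False).input_ids

prompt = "### Instruction:\nSay hello.\n\n### Response:\n"
inputs = tokenizer(prompt, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=64, bad_words_ids=bad_words_ids)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```

Something like `logit_bias` would be softer (down-weight rather than hard-ban), which is why I asked about it separately.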
I just ran into this error training alpaca-lora. Is there no fix available yet?
> Hi @teknium1 [this PR](https://github.com/wandb/wandb/pull/5283) may fix the issue, but it's currently under review. I will keep this thread updated once it's merged to master branch.

It's okay, for whatever...
Can you actually just "fine tune more context size"?
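To make the question concrete, this is the naive version of what I mean — model id and the 4096 target are placeholders, and whether plain fine-tuning at the longer length actually works well is exactly what I'm asking:

```python
from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer

# Naive sketch: raise the position limit on a LLaMA-style model, then fine-tune
# on longer sequences. LLaMA uses rotary embeddings, so there is no learned
# position table to resize, but quality at the longer length is the open question.
model_id = "yahma/llama-7b-hf"
config = AutoConfig.from_pretrained(model_id)
config.max_position_embeddings = 4096  # LLaMA was pretrained at 2048

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, config=config)
# ...then run a normal fine-tuning loop (e.g. LoRA) on sequences up to 4096 tokens.
```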