Andreas Köpf comments

Results: 365 comments by Andreas Köpf

New encourage message UI

An argument against secondary thumbs buttons outside the message box: The context might not be 100% clear, e.g. it could be interpreted as judgement for the whole conversation (because individual...

New encourage message UI

If it is only about the labels beside the thumbs icons, I would suggest:
- Good/Bad
- Great/Poor

Show a message to ask the user to report after a thumbs down

I am closing this issue because I don't think we need to specifically encourage users to red-flag messages. Red-flagging should be an exception to call a moderator because a message is...

While we definitely should use [QLoRA](https://arxiv.org/abs/2305.14314) (a groundbreaking result for the whole ML community) and only try a super-high quality final fine-tuning run (like OA top-1 threads, i.e. as it...

Switch to filtered prosocial dataset

Thanks a lot for working on prosocial, we got some negative comments for SFT-8 (not deployed yet) which used 15% of prosocial-dialog and had an unfiltered version of gpt4all. The...

Switch to filtered prosocial dataset

(@TCLProject if you want to help us determine the OA SFT-9 dataset mix, please contact Ollie or me via DM on Discord. almostEvil___ is coordinating the SFT-9 project.)

Resolve issue with ShareGPT_vicuna_unfiltered

There is now also: [Aeala/ShareGPT_Vicuna_unfiltered](https://huggingface.co/datasets/Aeala/ShareGPT_Vicuna_unfiltered)

NTK-Aware Scaled RoPE allows LLaMA models to have extended (8k+) context size without any fine-tuning and minimal perplexity degradation.

The associated PR #529 seems to add post-hoc RoPE scaling (for models trained without scaling). Now that linear and dynamic RoPE scaling have been merged into transformers (https://github.com/huggingface/transformers/pull/24653), more models will...
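
For readers unfamiliar with the two ideas mentioned here, the following is a minimal sketch of how linear scaling and NTK-aware scaling change the RoPE rotation angles. It is not the code from PR #529 or from transformers; the function names, the `factor` and `alpha` parameters, and the default base of 10000 are assumptions chosen for illustration, and dynamic scaling is not shown.

```python
def rope_inv_freq(dim, base=10000.0):
    """Inverse frequencies of standard RoPE for an even head dimension `dim`."""
    return [base ** (-(2 * i) / dim) for i in range(dim // 2)]


def linear_scaled_angles(pos, dim, factor, base=10000.0):
    """Linear scaling (position interpolation): shrink the position by `factor`.

    Every frequency band is compressed by the same amount.
    """
    return [(pos / factor) * f for f in rope_inv_freq(dim, base)]


def ntk_scaled_angles(pos, dim, alpha, base=10000.0):
    """NTK-aware scaling: keep positions as-is and enlarge the base instead.

    With base' = base * alpha ** (dim / (dim - 2)), the highest frequency is
    left untouched while the lowest frequency is stretched by exactly `alpha`.
    """
    new_base = base * alpha ** (dim / (dim - 2))
    return [pos * f for f in rope_inv_freq(dim, new_base)]


if __name__ == "__main__":
    pos, dim, scale = 4096, 128, 4.0
    lin = linear_scaled_angles(pos, dim, scale)
    ntk = ntk_scaled_angles(pos, dim, scale)
    # Compare the fastest-rotating and slowest-rotating bands.
    print("highest band: linear", lin[0], "ntk", ntk[0])
    print("lowest band:  linear", lin[-1], "ntk", ntk[-1])
```

In this sketch the two methods agree on the slowest-rotating band and differ on the fastest one: linear scaling compresses it, NTK-aware scaling leaves it unchanged.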

The Biggest Problem with Open Assistant Right Now

@JuliaBonita I understand that you and a lot of people are unhappy about the ending of open-assistant. Our core problem was simply that all founding team members had other obligations...

Supervised fine-tuning: "RuntimeError: expected scalar type Half but found Float" during evaluation

It's interesting that it occurs during eval. I asked @jordiclive and he said that he has trained several LLaMA LoRA models in fp16, including 7B. If you want to debug...