Andreas Köpf
# Action Plan for ML-Team

### 1. Data mixes

- [ ] create a list of all datasets under consideration for OA SFT, identify datasets that need further processing (e.g....
From an economic and ecological perspective, the current "Non-commercial bespoke" model license is suboptimal and should be changed to a truly liberal open-source license such as Apache 2.0. In...
Some of our datasets are markdown-formatted and others are plain text. Datasets using strict markdown (e.g. after conversion from HTML with a tool) escape special markdown characters like `_` ->...
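To make the formatting mismatch concrete, here is a minimal Python sketch of backslash-escaping and unescaping markdown special characters. The character set in `MD_SPECIAL` and both helper names are assumptions chosen for illustration; they are not taken from any Open-Assistant code, and a real converter may escape a different set.

```python
import re

# Assumed set of characters a strict HTML-to-markdown converter may escape.
MD_SPECIAL = "\\`*_{}[]()#+-.!|<>"
_CLASS = "[" + re.escape(MD_SPECIAL) + "]"

_ESCAPE_RE = re.compile("(" + _CLASS + ")")
_UNESCAPE_RE = re.compile(r"\\(" + _CLASS + ")")


def escape_markdown(text: str) -> str:
    """Prefix every special character with a backslash (hypothetical helper)."""
    return _ESCAPE_RE.sub(r"\\\1", text)


def unescape_markdown(text: str) -> str:
    """Drop the backslash in front of escaped special characters (hypothetical helper)."""
    return _UNESCAPE_RE.sub(r"\1", text)
```

With these helpers, `escape_markdown("snake_case")` gives `snake\_case`, and `unescape_markdown` reverses it, so a plain-text dataset and a strictly escaped markdown dataset could be brought to one convention before mixing.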
- add current llama-30 SFT training configuration
- print eval dataset sizes
- remove some verbose prints during startup
We are currently using [CarperAI/trlx](https://github.com/CarperAI/trlx) for our RL training and are quite happy with it. Today an RLHF [DeepSpeed-Chat](https://github.com/microsoft/DeepSpeedExamples/blob/master/applications/DeepSpeed-Chat/README.md) example was published for DeepSpeed. We should evaluate what the...
Optional "background" information about the style of the assistant-response to generate should be provided as a prompt via `...` messages (invisible during chat). To make effective use of this ``...
Currently we still need to manually run [sampling_score.py](https://github.com/LAION-AI/Open-Assistant/blob/main/model/model_eval/sampling_score.py) on our [sampling reports](https://open-assistant.github.io/oasst-model-eval/?f=https%3A%2F%2Fraw.githubusercontent.com%2FOpen-Assistant%2Foasst-model-eval%2Fmain%2Fsampling_reports%2Foasst-sft%2F2023-04-17_OpenAssistant_oasst-sft-7e2-llama-30b_sampling_noprefix2.json%0A%0A) after training. In order to simplify the evaluation process and to get a score from our RM earlier...
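As a rough illustration of what an automated scoring step could look like, the sketch below averages a reward score per sampling configuration over a report. The report layout (`prompts` → `results` → `sampling_config` / `outputs`) is an assumption made for this example, and `score_fn` is a hypothetical stand-in for the reward model; this is not the logic of `sampling_score.py`.

```python
from collections import defaultdict
from statistics import mean
from typing import Callable, Dict


def mean_scores(report: dict, score_fn: Callable[[str, str], float]) -> Dict[str, float]:
    """Average score_fn(prompt, output) per sampling configuration.

    Assumed (hypothetical) report layout:
    {"prompts": [{"prompt": str,
                  "results": [{"sampling_config": str, "outputs": [str, ...]}]}]}
    """
    per_config = defaultdict(list)
    for entry in report["prompts"]:
        for result in entry["results"]:
            for output in result["outputs"]:
                score = score_fn(entry["prompt"], output)
                per_config[result["sampling_config"]].append(score)
    return {name: mean(scores) for name, scores in per_config.items()}
```

In practice `score_fn` would wrap a call to the reward model; for a quick check, any function mapping a prompt and a reply to a float will do.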
A broader audience is now beginning to chat with our models. I saw multiple YouTube videos in which people were not sure whether OpenAssistant had an internet connection (search etc.) and...
In order to provide inference online at [open-assistant.io/chat/](https://open-assistant.io/chat/) for a longer period of time we need to find a sustainable solution that covers the high costs of operation. This issue...