Create a tool for the ML team to evaluate sampling configurations
We now have a small Python script that generates completions for random prompts from our dataset with different sampling parameters. The result is stored as a JSON report. We want to use it to manually compare the outputs of different base models and of the fine-tuned models. It would be super-cool if someone from the web team could help us build a comparison UI that allows inspecting 10+ report files.
The json reports have the following properties:
- each file contains information about the models
- each file contains a list of prompts (which are always identical across models: same prompts in the same order)
- for each prompt there are multiple sampling configurations (which are also identical across models per prompt, but NOT across prompts)
- for each sampling configuration we have the same number of outputs (a rough sketch of this structure follows below)
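I don't know the exact field names the script writes, so the following is only a rough sketch of the structure described above (all identifiers are guesses, please check against the real schema in the example reports). Something like this TypeScript shape is probably what the comparison UI would load:

```ts
// Assumed shape of one sampling report file -- field names are guesses
// based on the list above, not the actual schema.
interface SamplingReport {
  model_name: string;        // the model this report was generated with
  prompts: PromptEntry[];    // same prompts, in the same order, in every file
}

interface PromptEntry {
  prompt: string;
  results: SamplingResult[]; // one entry per sampling configuration
}

interface SamplingResult {
  sampling_config: string;   // e.g. "beam5" -- identical across models per prompt
  sampling_params: Record<string, number | string>;
  outputs: string[];         // same number of outputs for each configuration
}

// Load several report files selected via <input type="file" multiple>.
async function loadReports(files: File[]): Promise<SamplingReport[]> {
  return Promise.all(files.map(async (f) => JSON.parse(await f.text()) as SamplingReport));
}
```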
Example report files:
https://github.com/LAION-AI/Open-Assistant/tree/main/model/model_eval/manual/sampling_reports
The UI could for example show the following:
sampling drop-down: [beam5]
Prompt: "How to protect my eyes when I have to stare at my computer screen for longer than 10 hours every day?"
theblackcat102/pythia-1b-deduped-sft: "It's not good to stare at your computer screen for longer than 10 hours.",
theblackcat102/pythia-3b-deduped-sft: "1. Open your eyes as wide as possible.\n2. Close your eyes for a few seconds.\n3. Open your eyes again.\n4. Close your eyes for a few seconds.\n5. Open your eyes."
theblackcat102/pythia-12b-deduped-sft: "You should take regular breaks from staring at your computer screen."
.. show results for all the prompts (e.g. 100) ...
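To render a view like the one above, the UI essentially has to pivot the loaded reports: pick one prompt index and one sampling configuration, then collect the corresponding outputs from every model. A minimal sketch, reusing the assumed `SamplingReport` shape from above:

```ts
// One row of the comparison table: a model and its outputs for the
// currently selected prompt and sampling configuration.
interface ComparisonRow {
  model: string;
  outputs: string[];
}

function compareAt(
  reports: SamplingReport[],
  promptIndex: number,
  samplingConfig: string,  // value of the sampling drop-down, e.g. "beam5"
): ComparisonRow[] {
  return reports.map((report) => {
    const entry = report.prompts[promptIndex];
    const result = entry.results.find((r) => r.sampling_config === samplingConfig);
    return { model: report.model_name, outputs: result ? result.outputs : [] };
  });
}
```

Since the prompts and per-prompt sampling configurations are identical across files, the drop-down options can simply be read from the first loaded report.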
I can work on this
Here, have a play: https://johnflux.github.io/Open-Assistant-Model-Comparer/
Code is here: https://github.com/johnflux/Open-Assistant-Model-Comparer
Would it be possible to integrate this into our website? More precisely, into our Docusaurus docs?
https://github.com/LAION-AI/Open-Assistant/tree/main/docs
https://github.com/Open-Assistant/oasst-model-eval
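Should be doable: Docusaurus doc pages can be written as MDX, which can import and render React components, so the comparer would mainly need to be packaged as a component under `src/components`. A rough sketch of what that could look like (the path, component name, and props are placeholders, and it reuses the hypothetical `SamplingReport` / `compareAt` sketches from above):

```tsx
// src/components/ModelComparer.tsx (placeholder path/name) -- a minimal
// comparison view that could be imported into a Docusaurus MDX doc page.
import React, { useState } from 'react';

export default function ModelComparer({ reports }: { reports: SamplingReport[] }) {
  const [config, setConfig] = useState('beam5');
  const promptIndex = 0; // a real version would let the user page through prompts

  if (reports.length === 0) return <p>No reports loaded.</p>;

  const configs = reports[0].prompts[promptIndex].results.map((r) => r.sampling_config);
  const rows = compareAt(reports, promptIndex, config);

  return (
    <div>
      <select value={config} onChange={(e) => setConfig(e.target.value)}>
        {configs.map((c) => <option key={c}>{c}</option>)}
      </select>
      <p>{reports[0].prompts[promptIndex].prompt}</p>
      <table>
        <tbody>
          {rows.map((row) => (
            <tr key={row.model}>
              <td>{row.model}</td>
              <td><pre>{row.outputs.join('\n')}</pre></td>
            </tr>
          ))}
        </tbody>
      </table>
    </div>
  );
}
```

A doc page saved as `.mdx` could then do `import ModelComparer from '@site/src/components/ModelComparer';` and render `<ModelComparer reports={...} />` inline.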