Nathan Lambert issues

Results: 60 issues of Nathan Lambert

Dataset v2 discussion & feedback

Hey! Post any questions or complaints on the dataset. We'll log our internal goals and limitations here too.

1. It was pointed out by [Rishabh Agarwal](https://agarwl.github.io/) that the PRM Math...

question

Visualization requests

Some things to add:

- [ ] Pareto distribution of any Section or Subset

Comment anything else (or just watch my notes)

1. Make it so you can run inference over individual text prompts (rather than chosen + rejected)
2. Clean up nograd/detach (see https://twitter.com/shxf0072/status/1771220126655811610), but should be pretty obvious
3. Add...

enhancement

good first issue

Add Nvidia RMs (and Nemo compatibility)

See below! https://huggingface.co/models?library=nemo&search=RM This involves converting to HF format or adding nemo compatibility, if anyone has time!

enhancement

New Model

Experiment with human vs gpt4 data

With the human data AI2 has or a dataset like `no_robots`, we could test whether an RM prefers the human or the model answer.
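A minimal sketch of how such a comparison might be tallied. The `score` callable is a hypothetical stand-in for a reward model (here a toy length-based scorer), and the `(prompt, human_answer, model_answer)` triple layout is an assumption, not the format of `no_robots` or any AI2 dataset.

```python
from typing import Callable, Iterable, Tuple

# Hypothetical layout: (prompt, human_answer, model_answer)
Example = Tuple[str, str, str]


def human_preference_rate(
    examples: Iterable[Example],
    score: Callable[[str, str], float],
) -> float:
    """Fraction of examples where the scorer rates the human answer higher.

    Ties count as half a win so the result stays symmetric.
    """
    wins = 0.0
    total = 0
    for prompt, human_answer, model_answer in examples:
        human_score = score(prompt, human_answer)
        model_score = score(prompt, model_answer)
        if human_score > model_score:
            wins += 1.0
        elif human_score == model_score:
            wins += 0.5
        total += 1
    if total == 0:
        raise ValueError("no examples to compare")
    return wins / total


def toy_score(prompt: str, answer: str) -> float:
    # Placeholder for a real reward model: longer answers score higher.
    return float(len(answer))
```

A rate near 0.5 would suggest the scorer has no systematic preference; values far from 0.5 would suggest a bias toward one source.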

enhancement

Best of N benchmark

1. Take a few chat models as the “base set”, say 1-3, like tulu 2 7b and tulu 2 13b (maybe olmo-instruct)
2. Generate ~8 completions per prompt in AlpacaEval...
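The remaining steps are truncated in this listing, so the following only illustrates the general best-of-N idea: score every completion for a prompt and keep the top one. The `score` callable and the dictionary layout are assumptions for illustration, with a toy scorer in place of a real reward model.

```python
from typing import Callable, Dict, List


def best_of_n(
    completions: Dict[str, List[str]],
    score: Callable[[str, str], float],
) -> Dict[str, str]:
    """For each prompt, return the completion with the highest score."""
    best: Dict[str, str] = {}
    for prompt, candidates in completions.items():
        if not candidates:
            raise ValueError(f"no completions for prompt: {prompt!r}")
        # max() returns the first candidate among equal top scores.
        best[prompt] = max(candidates, key=lambda c: score(prompt, c))
    return best


def toy_score(prompt: str, completion: str) -> float:
    # Placeholder for a real reward model: longer completions score higher.
    return float(len(completion))
```

With roughly 8 completions per prompt, as suggested above, each prompt costs 8 scorer calls, so batching the scorer would matter in practice.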

🚀 New Contributors

Hey all! Drop a comment if you want to contribute, and introduce yourself if you like.

Clean up example's directory

Some todos, feel free to add more!

- [ ] Add readme of examples
- [ ] Convert examples to markdown, maybe put in folders and render with a Readme?

documentation

Logging audio issues

- [x] Leading spaces in text cause weird stuff before speaking, remove them
- [x] Things like $1mil are hard to filter but don't work, maybe an LLM can rephrase...
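A rough sketch of the two items above, using only the standard library. The function names and the regular expression are illustrative assumptions, not the project's actual code: the first helper strips leading whitespace, and the second only flags tokens such as `$1mil` so they could be handed to a rephrasing step.

```python
import re
from typing import List

# Assumed pattern: a dollar amount immediately followed by letters, e.g. "$1mil" or "$1.5k".
HARD_MONEY_TOKEN = re.compile(r"\$\d[\d,.]*[A-Za-z]+")


def clean_for_tts(text: str) -> str:
    """Remove leading whitespace from every line of the input text."""
    return "\n".join(line.lstrip() for line in text.splitlines())


def find_hard_tokens(text: str) -> List[str]:
    """Return money-like tokens that a TTS model may read badly."""
    return HARD_MONEY_TOKEN.findall(text)
```

Flagging rather than rewriting keeps the hard cases visible, since the right spoken form ("one million dollars") depends on context a regex cannot see.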

Clean up / enhance DPO code

Add other TTS models (via API key people are used to using)