Nathan Lambert

Results 175 comments of Nathan Lambert

I'm not a vLLM contributor (at least not heavily; I may have had a PR I don't remember), but I'm a heavy reward model user and a heavy infrastructure builder (you...

@zhuzilin I think the initial implementation is good at a quick pass. It covers the biggest things. (mostly acknowledging that I did, but without using it I am unlikely to...

Hey @RobinsonKO -- looking into this. For example, the sort of command that was run is this -- **note I didn't check the exact hyperparameters, copied from one of the SFT...

Can you say more, @RobinsonKO? I bet the OLMES repo has become a bit outdated relative to our setup (you can see it's not updated often), so it'll be hard...

@lyybonnie, can you say more? The link 404s.