open-instruct
initial judge refactor
Refactor to allow for an LM-judge-based "verifier".
- general
  - Added VerifierConfig so that we can configure verifiers per training run (builder pattern). There is not much in it yet, but it is easily extendable to any verifiers we want to configure in the future.
  - Verifiers now run async/in parallel by default. This should be a small perf bonus for fast verifiers (like math) but is hugely important for slow verifiers (like LM-as-a-judge and code execution).
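
To illustrate why running verifiers concurrently matters most for slow ones, here is a minimal sketch. It is not the code in this PR: the config fields, the class names MathVerifier and SlowJudgeVerifier, and the helper score_all are all hypothetical, and the sleep only stands in for a judge-model or code-execution round trip.

```python
import asyncio
import time
from dataclasses import dataclass


@dataclass
class VerifierConfig:
    """Hypothetical per-run verifier settings; field names are illustrative only."""
    judge_model: str = "example-judge"  # placeholder, not a real model name
    timeout_s: float = 5.0              # per-sample time budget


class MathVerifier:
    """Stand-in for a fast verifier: exact match after stripping whitespace."""

    def __init__(self, config: VerifierConfig):
        self.config = config

    async def verify(self, prediction: str, label: str) -> float:
        return 1.0 if prediction.strip() == label.strip() else 0.0


class SlowJudgeVerifier:
    """Stand-in for a slow verifier; the sleep simulates a remote judge call."""

    def __init__(self, config: VerifierConfig):
        self.config = config

    async def verify(self, prediction: str, label: str) -> float:
        await asyncio.sleep(0.2)  # simulated latency (assumption)
        return 1.0 if label.lower() in prediction.lower() else 0.0


async def score_all(verifier, predictions, labels, config: VerifierConfig):
    """Score every (prediction, label) pair concurrently with a per-sample timeout."""
    tasks = [
        asyncio.wait_for(verifier.verify(p, l), timeout=config.timeout_s)
        for p, l in zip(predictions, labels)
    ]
    # gather preserves input order, so scores line up with predictions
    return await asyncio.gather(*tasks)


if __name__ == "__main__":
    cfg = VerifierConfig(timeout_s=2.0)
    preds = ["The answer is Paris", "Berlin", "rome!", "Madrid"]
    golds = ["paris", "London", "Rome", "madrid"]
    start = time.perf_counter()
    scores = asyncio.run(score_all(SlowJudgeVerifier(cfg), preds, golds, cfg))
    print(scores, f"{time.perf_counter() - start:.2f}s")
```

With four samples the simulated judge calls overlap, so the wall-clock time stays close to one call's latency rather than four times that. The fast verifier gains little because it never waits on anything.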
Test Beaker training runs: Jupiter run here, Augusta run here