Isadora White
Hi! Cool paper! :) I have an environment and a dataset of successful/unsuccessful trajectories already. Is there an easy way to simply run DPO training on this trajectory dataset? Best,...