direct-preference-optimization
How are evals done on trained models?
Thanks for putting this together. I am wondering how evals are done on trained models.
Do you use any third-party evaluation libraries to measure trained model performance or metrics, or is there other eval code available?
Thanks in advance.