LongWriter
How can I use a local LLM (for example, Llama-3-70B-Instruct) to evaluate prediction quality?
Hi, you can use the evaluation code in eval_quality.py to evaluate generation quality. Just substitute the API call get_response_gpt4 with a call to your local LLM.
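For reference, here is a minimal sketch of what such a replacement could look like, assuming the local model is served through an OpenAI-compatible endpoint (e.g. vLLM started with `vllm serve meta-llama/Meta-Llama-3-70B-Instruct --port 8000`). The function name `get_response_local`, its signature, and the endpoint URL are illustrative assumptions, not part of the LongWriter code base; adapt the signature to match how get_response_gpt4 is called in eval_quality.py.

```python
# Sketch: local judge-model call, assuming an OpenAI-compatible server
# (e.g. vLLM) is running at http://localhost:8000/v1. The function name,
# signature, and endpoint are hypothetical placeholders.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

def get_response_local(prompt: str, temperature: float = 0.5, max_tokens: int = 1024) -> str:
    """Send the evaluation prompt to the locally served model and return its text reply."""
    completion = client.chat.completions.create(
        model="meta-llama/Meta-Llama-3-70B-Instruct",  # name registered with the local server
        messages=[{"role": "user", "content": prompt}],
        temperature=temperature,
        max_tokens=max_tokens,
    )
    return completion.choices[0].message.content
```

You would then pass the same judging prompt that eval_quality.py builds for get_response_gpt4 into this function instead, keeping the rest of the scoring logic unchanged.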