opencompass
[Bug] How to evaluate inference results of MultiPL-E?
Prerequisite
- [x] I have searched Issues and Discussions but cannot get the expected help.
- [x] The bug has not been fixed in the latest version.
Type
I'm evaluating with the officially supported tasks/models/datasets.
Environment
from gradio_client import Client
Reproduces the problem - code/configuration sample
Default Configuration
- IP Address: The default IP address for the evaluation service is set to 'https://opencompass-multiple-evaluator.hf.space'.
- Retries: The default number of retries is set to 5.
Reproduces the problem - command or script
Run an evaluation with the multiple_top_ten_gen_f44aaf.py config.
Reproduces the problem - error message
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
Other information
Issue: JSONDecodeError when evaluating results with CodeEvaluator
Problem
When trying to evaluate MultiPL-E results with CodeEvaluator, an error occurs while the client is being set up:
self.client = Client(ip_address)
This raises the following exception:
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
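For context, this exact message is what Python's json module raises when the text it is asked to parse is empty or does not start with a JSON value (an HTML error page, for example). Whether that is what the evaluation service returned here is an assumption, not something confirmed by the traceback. The helper name decode_error_for below is hypothetical and only illustrates the behaviour of json.loads:

```python
import json
from typing import Optional


def decode_error_for(body: str) -> Optional[str]:
    """Return the JSONDecodeError message for `body`, or None if it parses.

    Hypothetical helper for illustration; it is not part of OpenCompass
    or gradio_client.
    """
    try:
        json.loads(body)
    except json.JSONDecodeError as err:
        return str(err)
    return None


# An empty body and an HTML body both fail at the very first character,
# while a valid JSON body parses without error.
empty_result = decode_error_for("")
html_result = decode_error_for("<html>Service unavailable</html>")
json_result = decode_error_for('{"ok": true}')
```

If the service URL is fetched manually and the body is empty or HTML, that would be consistent with the error above and would point at the remote service rather than at the local evaluation code.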
Background
The CodeEvaluator class is used for evaluating code generation tasks by sending code to a remote evaluation service. It handles code extraction, processing, and result analysis. Here's the relevant code snippet from the CodeEvaluator class:
class CodeEvaluator(BaseEvaluator):
    """Evaluator for code generation tasks.

    This evaluator sends code to a remote evaluation service to test its
    functionality against provided test cases. It handles code extraction,
    processing, and result analysis.
    """

    def __init__(self,
                 language: str = 'py',
                 ip_address: str = 'localhost',
                 retry: int = 5) -> None:
        """Initialize the CodeEvaluator.

        Args:
            language (str): Programming language of the code to evaluate.
            ip_address (str, optional): IP address of the evaluation service. Defaults to 'localhost'.
            retry (int, optional): Number of retry attempts for failed connections. Defaults to 5.
        """
        self.language = language
        self.retry = retry
        # The client is created once here; this is the line that raises.
        self.client = Client(ip_address)
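In the snippet as quoted, the retry value is stored but the call to Client(ip_address) is made only once. A minimal sketch of a possible workaround is to retry the client construction itself and report a clearer error if every attempt fails. The function connect_with_retry and its delay parameter are hypothetical and not part of OpenCompass; the sketch assumes the failure is transient (for example, a service that is still starting up), which may not be the case here:

```python
import json
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def connect_with_retry(factory: Callable[[], T],
                       retry: int = 5,
                       delay: float = 1.0) -> T:
    """Call `factory` until it succeeds or `retry` attempts are used up.

    Hypothetical helper: `factory` is any zero-argument callable that
    builds the client, e.g. `lambda: Client(ip_address)`.
    """
    last_error: Optional[Exception] = None
    for attempt in range(1, retry + 1):
        try:
            return factory()
        except (json.JSONDecodeError, ConnectionError) as err:
            # Remember the failure and wait before the next attempt.
            last_error = err
            if attempt < retry:
                time.sleep(delay)
    raise RuntimeError(
        f"Could not create the client after {retry} attempts; the "
        "evaluation service may be unreachable or returning non-JSON data."
    ) from last_error
```

Inside __init__, this could be used as self.client = connect_with_retry(lambda: Client(ip_address), retry=retry). If the service is permanently unavailable, the retries will not help, but the final RuntimeError at least names the likely cause instead of surfacing a bare JSONDecodeError.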
Issue Details
Upon execution of self.client = Client(ip_address), the following error is encountered:
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
I have found no other way to evaluate my MultiPL-E results in OpenCompass.