[Bug] How to evaluate inference results of MultiPL-E?

Open MichaelYang-lyx opened this issue 2 months ago • 0 comments

Prerequisite

Type

I'm evaluating with the officially supported tasks/models/datasets.

Environment

from gradio_client import Client

Reproduces the problem - code/configuration sample

Default Configuration

  • IP Address: The default IP address for the evaluation service is set to 'https://opencompass-multiple-evaluator.hf.space'.
  • Retries: The default number of retries is set to 5.

Reproduces the problem - command or script

Run an evaluation with the config multiple_top_ten_gen_f44aaf.py.

Reproduces the problem - error message

json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

Other information

Issue: JSONDecodeError when evaluating results with CodeEvaluator

Problem

When trying to evaluate MultiPL-E results with CodeEvaluator, an error occurs while the client is being set up:

self.client = Client(ip_address)

This raises the following exception:

json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
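
For reference, this exact message is what Python's json module reports when the text it is asked to parse is empty or does not start with a JSON value, for example an HTML page. The sketch below reproduces it with the standard library only. The two response bodies are hypothetical stand-ins, not captured from the evaluation service, and `decode_error_message` is a helper name made up for this illustration.

```python
import json


def decode_error_message(body: str) -> str:
    """Return the JSONDecodeError message for `body`, or '' if it parses."""
    try:
        json.loads(body)
    except json.JSONDecodeError as err:
        return str(err)
    return ""


# Hypothetical response bodies (assumptions, not real service output).
empty_body = ""
html_body = "<html><body>Service unavailable</body></html>"

print(decode_error_message(empty_body))
print(decode_error_message(html_body))
print(repr(decode_error_message('{"ok": true}')))  # valid JSON, no error
```

Both non-JSON bodies produce the same message as in the traceback, so the error by itself does not say whether the service returned nothing or returned a non-JSON page.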

Background

The CodeEvaluator class is used for evaluating code generation tasks by sending code to a remote evaluation service. It handles code extraction, processing, and result analysis. Here's the relevant code snippet from the CodeEvaluator class:

class CodeEvaluator(BaseEvaluator):
    """Evaluator for code generation tasks.

    This evaluator sends code to a remote evaluation service to test its
    functionality against provided test cases. It handles code extraction,
    processing, and result analysis.
    """

    def __init__(self,
                 language: str = 'py',
                 ip_address: str = 'localhost',
                 retry: int = 5) -> None:
        """Initialize the CodeEvaluator.

        Args:
            language (str): Programming language of the code to evaluate.
            ip_address (str, optional): IP address of the evaluation service. Defaults to 'localhost'.
            retry (int, optional): Number of retry attempts for failed connections. Defaults to 5.
        """
        self.language = language
        self.retry = retry

        self.client = Client(ip_address)
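
In the snippet as quoted, the client is constructed once and `retry` is only stored. If the failure is transient, one possible workaround is to retry the construction itself. The following is a minimal sketch under that assumption, not the library's behavior: `connect_with_retry` and `FlakyFactory` are hypothetical names, and the factory stands in for `lambda: Client(ip_address)` so the example runs without network access.

```python
import json
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def connect_with_retry(factory: Callable[[], T],
                       retry: int = 5,
                       delay: float = 0.0) -> T:
    """Call `factory` up to `retry` times, retrying on JSONDecodeError."""
    last_error: Optional[Exception] = None
    for attempt in range(1, retry + 1):
        try:
            return factory()
        except json.JSONDecodeError as err:
            last_error = err
            if attempt < retry:
                time.sleep(delay)  # wait before the next attempt
    raise ConnectionError(
        f"service did not return valid JSON after {retry} attempts"
    ) from last_error


class FlakyFactory:
    """Hypothetical stand-in for a client constructor that fails at first."""

    def __init__(self, failures: int) -> None:
        self.failures = failures
        self.calls = 0

    def __call__(self) -> str:
        self.calls += 1
        if self.calls <= self.failures:
            json.loads("")  # raises the same JSONDecodeError as in the report
        return "connected"


factory = FlakyFactory(failures=2)
print(connect_with_retry(factory, retry=5), "after", factory.calls, "calls")
```

With the real client, the call would look like connect_with_retry(lambda: Client(ip_address), retry=5). This only helps if the service recovers between attempts; if it is permanently unavailable, the wrapper just turns the JSONDecodeError into a clearer ConnectionError.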

Issue Details

Upon execution of self.client = Client(ip_address), the following error is encountered:

json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

I have found no other way to evaluate my MultiPL-E results in OpenCompass.

MichaelYang-lyx, Sep 09 '25 09:09