
[Bug]: AgentEval Notebook agenteval_cq_math cannot retrieve math problem examples

Open DLWCMD opened this issue 9 months ago • 7 comments

Describe the bug

This notebook demonstrates the AgentEval framework using two sample math problem results. However, the result files cannot be read.

The cell defines function read_without_groundtruth to read the results as shown in the code snippet below.

response_successful = read_without_groundtruth(
    "../test/test_files/agenteval-in-out/samples/sample_math_response_successful.txt"
)[0]
response_failed = read_without_groundtruth(
    "../test/test_files/agenteval-in-out/samples/sample_math_response_failed.txt"
)[0]

However, in both cases, the read fails with a FileNotFoundError and ValueError:

if file in {0, 1, 2}:
    304     raise ValueError(
    305         f"IPython won't let you open fd={file} by default "
    306         "as it is likely to crash IPython. If you know what you are doing, "
    307         "you can use builtins' open."
    308     )
--> 310 return io_open(file, *args, **kwargs)
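
One way to confirm that the failure is simply a missing file is to check both paths before calling read_without_groundtruth. The sketch below is not part of the notebook: missing_samples is a hypothetical helper, and it assumes the notebook is run from a directory in which the relative path ../test/test_files/agenteval-in-out/samples is meant to resolve.

```python
from pathlib import Path

# Relative location used by the notebook cell quoted above.
SAMPLE_DIR = Path("../test/test_files/agenteval-in-out/samples")
SAMPLE_FILES = [
    "sample_math_response_successful.txt",
    "sample_math_response_failed.txt",
]


def missing_samples(sample_dir=SAMPLE_DIR, names=SAMPLE_FILES):
    """Return the resolved paths of sample files that do not exist (hypothetical helper)."""
    sample_dir = Path(sample_dir)
    return [
        (sample_dir / name).resolve()
        for name in names
        if not (sample_dir / name).is_file()
    ]


# Relative paths are resolved against the current working directory,
# so printing it shows where the notebook is actually looking.
print("Working directory:", Path.cwd())
for path in missing_samples():
    print("Missing:", path)
```

If both files are listed as missing, the notebook is most likely running outside a full checkout of the repository.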

Steps to reproduce

  1. Load the notebook
  2. Prepare the notebook to be run
    • Create OAI_CONFIG_LIST
    • Set API Endpoint
  3. Run the notebook.

Failure as reported above will occur in the cell which defines the read_without_groundtruth function.

Model Used

gpt-4

Expected Behavior

The math logs should have been available to read and provide the two outcome examples required for the execution of the notebook.

Screenshots and logs

Screenshot 2024-05-04 at 5 13 46 PM

Additional Information

No response

DLWCMD avatar May 04 '24 21:05 DLWCMD

cc @julianakiseleva @jluey1

ekzhu avatar May 05 '24 04:05 ekzhu

@DLWCMD can you please verify that you have pulled the required files from the repo, namely:

test/test_files/agenteval-in-out/samples/sample_math_response_failed.txt

julianakiseleva avatar May 06 '24 23:05 julianakiseleva

I was able to obtain the required files, but only after I loaded the entire repo. It is not obvious where these files can be found.

Thanks for following up.


DLWCMD avatar May 07 '24 00:05 DLWCMD

Does it work for you now? If it does -- can you please close the issue?

julianakiseleva avatar May 07 '24 00:05 julianakiseleva

It does, with one exception. I will include that when I close the issue.

Thanks for your prompt attention to this matter.


DLWCMD avatar May 07 '24 00:05 DLWCMD

Please let us know what exception you are getting.

julianakiseleva avatar May 07 '24 00:05 julianakiseleva

I will respond later today with all details, including closing this issue.


DLWCMD avatar May 07 '24 12:05 DLWCMD

Here is the warning I mentioned. It is associated with results plotting, but I am not clear if or how the warning affects the displayed plots.

/Users/david/anaconda3/envs/autogen_env/lib/python3.12/site-packages/scipy/stats/_distn_infrastructure.py:2246: RuntimeWarning: invalid value encountered in multiply
  lower_bound = _a * scale + loc
/Users/david/anaconda3/envs/autogen_env/lib/python3.12/site-packages/scipy/stats/_distn_infrastructure.py:2247: RuntimeWarning: invalid value encountered in multiply
  upper_bound = _b * scale + loc
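
A plausible cause, offered as an assumption and not a confirmed diagnosis: the warning lines multiply a support bound by scale, and an infinite bound times a scale of zero is undefined. That would happen if the plotting code passes a standard error of zero as scale, for example when every score for a criterion is identical. The sketch below uses only the standard library; standard_error and interval_is_degenerate are hypothetical helpers, not functions from the notebook or from SciPy.

```python
import math
import statistics


def standard_error(values):
    """Standard error of the mean; NaN when fewer than two values are given."""
    if len(values) < 2:
        return float("nan")
    return statistics.stdev(values) / math.sqrt(len(values))


def interval_is_degenerate(values):
    """True when a confidence interval built from these values has no usable width."""
    sem = standard_error(values)
    return math.isnan(sem) or sem == 0.0


# An infinite bound times a zero scale is NaN. This is the arithmetic that
# "invalid value encountered in multiply" would be reporting under this assumption.
bound = -math.inf * 0.0
```

If this is the cause, the affected intervals would come out as NaN, so the corresponding error bars would likely be absent while the plotted means stay correct. Checking interval_is_degenerate on each group of scores before computing an interval would show which groups are involved.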

Further, as noted in the original report, this notebook demonstrates the AgentEval framework using two sample math problem results. Those files are not available to the notebook under its current configuration. I resolved the issue by cloning the entire AutoGen repo, which includes the two sample files and so makes them available to the notebook. I believe the best approach is to embed them directly in the notebook instead of loading them from disk.
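
As an alternative to cloning the whole repository, the two files could be fetched individually. This is a minimal sketch under stated assumptions: the raw-content host and the branch name main are guesses that may need adjusting, and raw_url and fetch_samples are hypothetical helpers. Only the repository name and the sample path come from this thread.

```python
from pathlib import Path
from urllib.request import urlretrieve

# Assumptions: raw-content host and branch name; adjust if the files live elsewhere.
RAW_BASE = "https://raw.githubusercontent.com/microsoft/autogen"
BRANCH = "main"
# Path and file names as given in this issue.
SAMPLE_DIR = "test/test_files/agenteval-in-out/samples"
SAMPLE_FILES = [
    "sample_math_response_successful.txt",
    "sample_math_response_failed.txt",
]


def raw_url(name, branch=BRANCH):
    """Build the assumed raw-download URL for one sample file."""
    return f"{RAW_BASE}/{branch}/{SAMPLE_DIR}/{name}"


def fetch_samples(dest_root="..", branch=BRANCH):
    """Download any missing sample files so the notebook's relative paths resolve."""
    dest_dir = Path(dest_root) / SAMPLE_DIR
    dest_dir.mkdir(parents=True, exist_ok=True)
    for name in SAMPLE_FILES:
        target = dest_dir / name
        if not target.is_file():
            urlretrieve(raw_url(name, branch), str(target))
    return dest_dir
```

Calling fetch_samples() from the notebook's directory before the read_without_groundtruth cell would place the files where the relative path ../test/test_files/agenteval-in-out/samples expects them.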

Hope this helps. Let me know if you have any questions.

David Wilt

DLWCMD avatar May 07 '24 16:05 DLWCMD