[code_review] Explore different models

Open marco-c opened this issue 1 year ago • 4 comments

Similar to #4582, but across different models.

This depends on #4580 for the evaluation.

marco-c avatar Nov 01 '24 10:11 marco-c

This is already available as part of the experimental mode, which returns the results generated by each of the models/configurations to the user for evaluation.

The following are the configurations in the experimental mode:

  • gpt-4o temp 0.2
  • gpt-4o temp 0.8
  • claude-3-5 temp 0.2
  • ~gemini-1.5-pro temp 0.2~ (disabled due to a quota limitation error)
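
For illustration only, the list above could be represented as data along these lines. This is a minimal sketch, not the review helper's actual code: `ModelConfig`, `EXPERIMENTAL_CONFIGS`, `run_experimental`, and the `generate` callback are all hypothetical names, and the only values taken from this thread are the model names, temperatures, and the disabled state of gemini-1.5-pro.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable


@dataclass(frozen=True)
class ModelConfig:
    """One model/temperature pair (hypothetical structure, for illustration)."""

    model: str
    temperature: float
    enabled: bool = True

    @property
    def label(self) -> str:
        # Same naming as in the list above, e.g. "gpt-4o temp 0.2".
        return f"{self.model} temp {self.temperature}"


# Values copied from the list above; gemini-1.5-pro is marked as disabled.
EXPERIMENTAL_CONFIGS = [
    ModelConfig("gpt-4o", 0.2),
    ModelConfig("gpt-4o", 0.8),
    ModelConfig("claude-3-5", 0.2),
    ModelConfig("gemini-1.5-pro", 0.2, enabled=False),
]


def run_experimental(
    configs: Iterable[ModelConfig],
    generate: Callable[[ModelConfig], str],
) -> Dict[str, str]:
    """Call `generate` once per enabled configuration, keyed by its label.

    `generate` stands in for whatever actually produces a review; it is an
    assumption of this sketch, not an existing function.
    """
    return {cfg.label: generate(cfg) for cfg in configs if cfg.enabled}
```

With a stub such as `lambda cfg: f"review from {cfg.model}"` passed as `generate`, the returned dictionary has one entry per enabled configuration, which is the shape that would be handed back to the user for side-by-side evaluation.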

suhaibmujahid avatar Nov 27 '24 01:11 suhaibmujahid

I deployed a new version of the review helper, which re-enables gemini-1.5-pro.

So currently, the following are the configurations in the experimental mode:

  • gpt-4o temp 0.2
  • gpt-4o temp 0.8
  • claude-3-5 temp 0.2
  • gemini-1.5-pro temp 0.2

suhaibmujahid avatar Jan 08 '25 15:01 suhaibmujahid

> I deployed a new version of the review helper, which re-enables gemini-1.5-pro.
>
> So currently, the following are the configurations in the experimental mode:
>
> * gpt-4o temp 0.2
> * gpt-4o temp 0.8
> * claude-3-5 temp 0.2
> * gemini-1.5-pro temp 0.2

And after https://github.com/mozilla/bugbug/pull/4731, it's going to be Gemini 2.0 Flash instead of Gemini 1.5 Pro.

marco-c avatar Jan 08 '25 15:01 marco-c

@suhaibmujahid could you share the latest stats here, and the query you're using to get them (so I can run it myself easily as well)?

marco-c avatar Apr 22 '25 12:04 marco-c