Lorenzo Pacchiardi

Results 11 issues of Lorenzo Pacchiardi

I am trying to train the SqueezeNet or AlexNet architecture on a subset of the ImageNet dataset (in particular, using only a small number of classes). I tried with many choices...

The `train_distributed.py` script imports the function `build_squeezenet_fastfood` from `model_builders.py`, but that function is not defined there or in any other file in the repository. Could you please add it?

Fix #1504 ## Final checklist 👀 ### Submission agreement By contributing to Evals, you are agreeing to make your evaluation logic and data under the same MIT license as this...

### Describe the bug The response to issue #512 implemented a way to dynamically change API parameters (such as temperature) from the CLI (by looking at the [code](https://github.com/openai/evals/blob/5a92ac38155cb32dcde1cc8b69b5e002e9437532/evals/cli/oaieval.py#L34-L39), the argument...

bug

### Describe the feature or improvement you're requesting The current implementation of the [`Match`](https://github.com/openai/evals/blob/main/evals/elsuite/basic/match.py) basic eval template is case-sensitive. This leads to results such as: `{'correct': False, 'expected': 'no', 'picked':...
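A minimal sketch of what a case-insensitive match could look like. It assumes prefix-style matching of the sampled answer against the expected answers. The function name `match_case_insensitive` and the shape of the returned dictionary are hypothetical illustrations, not the `evals` API.

```python
from typing import Dict, List, Optional


def match_case_insensitive(sampled: str, expected: List[str]) -> Dict[str, object]:
    """Hypothetical helper (not part of evals): prefix match that ignores case."""
    # casefold() is a more aggressive lower() intended for caseless comparison.
    normalized = sampled.strip().casefold()
    picked: Optional[str] = next(
        (e for e in expected if normalized.startswith(e.strip().casefold())),
        None,
    )
    return {"correct": picked is not None, "expected": expected, "picked": picked}


result = match_case_insensitive("No", ["no"])
```

With this normalization, a sampled answer of `No` counts as a match for an expected answer of `no`, while `Yes` still does not.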

### Describe the feature or improvement you're requesting [build_eval.md](https://github.com/openai/evals/blob/main/docs/build-eval.md#for-model-graded-evals-a-step-by-step-workflow) says: > In general, the evaluation model and the model being evaluated don't have to be the same, though we will...

The [documentation](https://github.com/openai/evals/blob/main/docs/eval-templates.md#the-model-graded-eval-template) for modelgraded evals says: > In general, the evaluation model and the model being evaluated don't have to be the same, though we will assume that they are...

Hi, thank you so much for coding this, it is helpful. I have installed everything as required (on Ubuntu 20.04) and tried running it on the provided example. However, the...

Currently, `evals` attempts to call gpt-4o models via the Completion API, which is incorrect. Simply adding gpt-4o to the list of models to be called with the Chat API fixes...
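The kind of routing described here can be sketched as follows. The names `CHAT_MODELS` and `uses_chat_api`, the contents of the set, and the matching rule are all assumptions made for illustration. They are not taken from the `evals` source.

```python
# Hypothetical list of model families served by the Chat API (assumption).
CHAT_MODELS = {"gpt-3.5-turbo", "gpt-4", "gpt-4o"}


def uses_chat_api(model_name: str) -> bool:
    """Return True if the model name is a listed chat model or a dated/suffixed variant."""
    return any(
        model_name == base or model_name.startswith(base + "-")
        for base in CHAT_MODELS
    )


for name in ("gpt-4o", "gpt-4o-2024-05-13", "davinci-002"):
    api = "chat" if uses_chat_api(name) else "completion"
    print(f"{name}: {api}")
```

Under this matching rule, `gpt-4o` is not covered by the `gpt-4` entry (it does not start with `gpt-4-`), so it has to be listed explicitly, which mirrors the kind of one-line fix the issue describes.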