Dan Saattrup Smart

The fix is now in [this vLLM PR](https://github.com/vllm-project/vllm/pull/4558); waiting for it to be merged and released.

Fixed in https://github.com/ScandEval/ScandEval/releases/tag/v12.10.5 🎉

Bug! https://github.com/ScandEval/ScandEval/issues/271

Another example: https://docplayer.nl/55120193-Dit-is-een-oefentoets-knm-voor-het-inburgeringsexamen-print-deze-toets-uit-elke-vraag-is-multiple-choice-u-mag-geen-woordenboek-gebruiken.html

To optimise evaluation, install the following libraries:

```
pip install "causal-conv1d>=1.2.0"
pip install mamba-ssm
```
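
A quick way to see whether these optional libraries are already present is to ask Python which modules it can find. This is only a sketch: the helper `missing_modules` is hypothetical, and the import names `causal_conv1d` and `mamba_ssm` are assumed to correspond to the pip packages above.

```python
from importlib.util import find_spec


def missing_modules(names: list[str]) -> list[str]:
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    # Assumed import names for the two pip packages; adjust if they differ.
    missing = missing_modules(["causal_conv1d", "mamba_ssm"])
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("Both optional libraries are available.")
```

Only top-level module names are passed to `find_spec`, so nothing is actually imported while checking.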

Requires `transformers>=4.39.0`, so evaluation is on hold until that version is released.
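
For reference, a minimum-version requirement like this can be checked before starting an evaluation. The snippet below is a minimal sketch using only the standard library. The helpers `release_tuple`, `meets_minimum` and `transformers_is_new_enough` are hypothetical names, and the comparison deliberately ignores pre-release suffixes such as `rc1`, which a full PEP 440 parser would handle.

```python
import re
from importlib import metadata


def release_tuple(version: str) -> tuple[int, ...]:
    """Extract the leading numeric components, e.g. '4.40.0.dev0' -> (4, 40, 0)."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component such as 'dev0'
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed: str, required: str) -> bool:
    """Compare numeric release components only (simplification: suffixes ignored)."""
    a, b = release_tuple(installed), release_tuple(required)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))  # pad so that '4.39' compares equal to '4.39.0'
    b += (0,) * (width - len(b))
    return a >= b


def transformers_is_new_enough(required: str = "4.39.0") -> bool:
    """Return False if transformers is missing or older than the required version."""
    try:
        installed = metadata.version("transformers")
    except metadata.PackageNotFoundError:
        return False
    return meets_minimum(installed, required)
```

Calling `transformers_is_new_enough()` returns `False` both when the package is missing and when it is too old, so a caller can postpone the evaluation in either case.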

Evaluating this with vLLM depends on [this PR](https://github.com/vllm-project/vllm/pull/6484).

> I am currently testing with google/t5-v1_1-base, and at least SweReC seems to work. What problems would I expect to encounter? Thanks!

No expected problems per se, but I'd appreciate...

This raises a `scandeval.exceptions.InvalidBenchmark: NaN value detected in model outputs, even with mixed precision disabled.` exception. Sometimes mixed precision isn't disabled correctly, so I will try benchmarking it in full...
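
To illustrate the kind of check that produces such an error, here is a minimal sketch of NaN detection over model outputs. It is not ScandEval's actual implementation: `contains_nan` is a hypothetical helper, and plain nested Python lists stand in for the tensors a real model would return.

```python
import math
from collections.abc import Iterable


def contains_nan(values) -> bool:
    """Return True if any float in a (possibly nested) sequence is NaN."""
    if isinstance(values, float):
        return math.isnan(values)
    if isinstance(values, Iterable) and not isinstance(values, (str, bytes)):
        return any(contains_nan(item) for item in values)
    return False


if __name__ == "__main__":
    # Hypothetical logits for two examples; the second contains a NaN.
    logits = [[0.1, 0.9], [float("nan"), 0.3]]
    if contains_nan(logits):
        print("NaN value detected in model outputs")
```

One common source of such values is overflow in reduced precision: once an intermediate result becomes infinite, an operation like `inf - inf` yields NaN, which is why rerunning in full precision is a reasonable next step.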

> This raises an `scandeval.exceptions.InvalidBenchmark: NaN value detected in model outputs, even with mixed precision disabled.` exception. Sometimes mixed precision isn't disabled correctly, so I will try benchmarking it in...