Dan Saattrup Smart comments

70 comments by Dan Saattrup Smart

[BUG] Outlines version clash with vLLM

The fix is now in [this vLLM PR](https://github.com/vllm-project/vllm/pull/4558); waiting for that one to be merged and published.

[BUG] Outlines version clash with vLLM

Fixed in https://github.com/ScandEval/ScandEval/releases/tag/v12.10.5 🎉

[MODEL EVALUATION REQUEST] ThatsGroes/Munin-NeuralBeagle-SkoleGPT-instruct

Bug! https://github.com/ScandEval/ScandEval/issues/271

[BENCHMARK DATASET REQUEST] Inburgeringsexamen

Another example: https://docplayer.nl/55120193-Dit-is-een-oefentoets-knm-voor-het-inburgeringsexamen-print-deze-toets-uit-elke-vraag-is-multiple-choice-u-mag-geen-woordenboek-gebruiken.html

[MODEL EVALUATION REQUEST] Mamba-2.8b

To optimise evaluation, install the following libraries:

```
pip install "causal-conv1d>=1.2.0"
pip install mamba-ssm
```

[MODEL EVALUATION REQUEST] Mamba-2.8b

Requires `transformers>=4.39.0`; holding off on evaluation until that version is released.
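
A minimal sketch of how one might check locally whether the installed `transformers` meets that minimum before attempting the evaluation. The helper names (`parse_release`, `meets_minimum`, `transformers_is_new_enough`) are hypothetical and not part of ScandEval or `transformers`. The comparison only looks at the leading numeric components, so pre-release suffixes such as `.dev0` are ignored.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_release(version_string: str) -> tuple[int, ...]:
    """Return the leading numeric components, e.g. "4.39.0.dev0" -> (4, 39, 0)."""
    parts = []
    for piece in version_string.split("."):
        if not piece.isdigit():
            break  # stop at the first non-numeric piece (pre-release tags are ignored)
        parts.append(int(piece))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str) -> bool:
    """Compare two version strings on their numeric release components only."""
    a, b = parse_release(installed), parse_release(minimum)
    # Pad the shorter tuple with zeros so that "4.39" compares equal to "4.39.0".
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def transformers_is_new_enough(minimum: str = "4.39.0") -> bool:
    """Hypothetical helper: False if transformers is missing or older than `minimum`."""
    try:
        return meets_minimum(version("transformers"), minimum)
    except PackageNotFoundError:
        return False
```

For a strict PEP 440 comparison that also orders pre-releases correctly, the third-party `packaging` library would be the more robust choice. The sketch above sticks to the standard library.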

[MODEL EVALUATION REQUEST] Mamba-2.8b

Evaluating this with vLLM depends on [this PR](https://github.com/vllm-project/vllm/pull/6484).

[FEATURE REQUEST] Support seq-to-seq architectures

> I am currently testing with google/t5-v1_1-base, and at least SweReC seems to work. What problems would I expect to encounter? Thanks!

No expected problems per se, but I'd appreciate...

[MODEL EVALUATION REQUEST] intfloat/multilingual-e5-large-instruct

This raises a `scandeval.exceptions.InvalidBenchmark: NaN value detected in model outputs, even with mixed precision disabled.` exception. Sometimes mixed precision isn't disabled correctly, so I will try benchmarking it in full...

[MODEL EVALUATION REQUEST] intfloat/multilingual-e5-large-instruct

> This raises an `scandeval.exceptions.InvalidBenchmark: NaN value detected in model outputs, even with mixed precision disabled.` exception. Sometimes mixed precision isn't disabled correctly, so I will try benchmarking it in...