Dan Saattrup Smart
The vLLM implementation doesn't work properly yet, but the Ollama one does - so I'm now evaluating `ollama_chat/gpt-oss:120b`.
@Mikeriess Actually, if you have GPU capacity, could you evaluate the `ollama_chat/gpt-oss:120b` model, and I can grab the 20b one? FYI: I'm just running on the validation splits for now...
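For reference, splitting the two runs could look something like this (just a sketch; the flag for restricting to the validation splits is omitted here):

```bash
# Larger model on the machine with more GPU capacity
euroeval --model "ollama_chat/gpt-oss:120b"

# Smaller model on mine
euroeval --model "ollama_chat/gpt-oss:20b"
```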
@Mikeriess Their entire model is quantised with MXFP4 AFAIK - it's not an Ollama thing 🙁 I've also tried the guide, and I still get errors, and [so do many others](https://github.com/vllm-project/vllm/issues/22403#issuecomment-3166853703),...
From [the HF model card](https://huggingface.co/openai/gpt-oss-120b), on the MXFP4 quantisation:
@Mikeriess Have Ollama running (`ollama serve`) separately, and then run EuroEval with the model ID `ollama_chat/gpt-oss:120b`. You install Ollama with

```bash
curl -fsSL https://ollama.com/install.sh | sh
```

Let me know...
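In practice the flow is something like this (a sketch; the `ollama_chat/` prefix is LiteLLM's Ollama chat backend, which EuroEval uses for API models AFAIK):

```bash
# Terminal 1: keep the Ollama server running
ollama serve

# Terminal 2: run the evaluation against the local Ollama server
euroeval --model "ollama_chat/gpt-oss:120b"
```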
@Mikeriess Can you try updating `transformers` and `fbgemm-gpu`?
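E.g.:

```bash
pip install --upgrade transformers fbgemm-gpu
```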
@Mikeriess Can you send your version of `transformers`? Seems like something weird is happening there.
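You can grab it with, e.g.:

```bash
python -c "import transformers; print(transformers.__version__)"
```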
@Mikeriess There might be a chance that conda is messing things up, making Python not able to detect what's installed. Can you try creating a fresh virtual environment (with pip),...
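Something like this (a sketch, assuming a plain pip installation of EuroEval):

```bash
python -m venv .venv
source .venv/bin/activate
pip install euroeval
euroeval --model "ollama_chat/gpt-oss:120b"
```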
> Will have a look at this asap. We have this model running on NIM with OpenAI-like API - could that be an alternative?

That could work yep! Seems like...
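A sketch of how that might look, assuming EuroEval passes an API base through to LiteLLM's OpenAI-compatible backend (both the `--api-base` flag and the endpoint URL are assumptions on my part):

```bash
# Hypothetical: point EuroEval at an OpenAI-compatible NIM endpoint
euroeval --model "openai/gpt-oss-120b" --api-base "http://localhost:8000/v1"
```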
> [@saattrupdan](https://github.com/saattrupdan) I just made a sanity-check with a much smaller model (qwen3-1.7b), but I got this error - any idea what the problem is?

It's just during shutdown, your...