Dan Saattrup Smart
I like the idea of testing what the chance is for a model to have "cheated" on a benchmark. However, the method in the paper that you link to requires,...
Makes sense! In that case we probably need to create a `gemini_models` module, analogous to `openai_models`, which has classes `GeminiTokenizer` and `GeminiModel`, as well as a `model_setups.gemini` module analogous to...
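A minimal sketch of what such a module could look like (the module and class names `gemini_models`, `GeminiTokenizer` and `GeminiModel` come from the comment above; all method signatures and bodies are assumptions, mirroring only the general shape of a model wrapper):

```python
# gemini_models.py -- hypothetical sketch, structured analogously to the
# existing `openai_models` module; signatures and bodies are assumptions.


class GeminiTokenizer:
    """Minimal tokenizer wrapper for Gemini models (sketch)."""

    def __init__(self, model_id: str) -> None:
        self.model_id = model_id

    def tokenize(self, text: str) -> list[str]:
        # Placeholder: a real implementation would call the Gemini
        # tokenisation endpoint; here we just split on whitespace.
        return text.split()


class GeminiModel:
    """Minimal model wrapper for Gemini models (sketch)."""

    def __init__(self, model_id: str) -> None:
        self.model_id = model_id
        self.tokenizer = GeminiTokenizer(model_id)

    def generate(self, prompt: str) -> str:
        # Placeholder: a real implementation would call the Gemini API.
        raise NotImplementedError("API call not implemented in this sketch")
```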
I'm not familiar with LlamaIndex, but evaluating the closed models currently involves more than "merely" generating sequences. JSON mode is heavily used for NER tasks, and we use the logits...
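To illustrate the JSON-mode point, here is a sketch of a request payload for an NER task as it could be sent to the OpenAI chat completions API (the `response_format={"type": "json_object"}` parameter is the API's real JSON mode; the model name, prompt wording and output schema are illustrative assumptions, not the project's actual prompts):

```python
# Sketch: build a JSON-mode chat request for named entity recognition.
# No network call is made here; this only constructs the payload dict.
def build_ner_request(text: str) -> dict:
    return {
        "model": "gpt-4o",  # assumed model name, for illustration only
        "response_format": {"type": "json_object"},  # force valid JSON output
        "messages": [
            {
                "role": "system",
                "content": (
                    "Extract named entities from the user's text and reply "
                    'with JSON of the form {"entities": [...]}.'
                ),
            },
            {"role": "user", "content": text},
        ],
    }
```

Because decoding is constrained to valid JSON, the entity spans can be parsed deterministically instead of scraped from free-form text, which is why plain sequence generation is not enough here.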
@Mikeriess This one yep (and in general Mixtral-type models) - thanks 🙂
> Alright - this model is gated, so I'll need to use my access token. What is the name of the argument to add here? (couldn't find it in the...
> Getting a `[rank0]: torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 134.00 MiB. GPU `. I assume 96GB VRAM isn't enough for this model :-) That's exactly the reason...
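Some back-of-the-envelope arithmetic shows why a large model can overflow a 96GB card even before activations or the KV cache are counted (the 70B parameter count is an illustrative assumption, not the model from the comment):

```python
# Rough VRAM estimate for holding model weights alone -- no activations,
# optimizer state or KV cache, which all add further memory on top.
def weight_vram_gb(n_params: float, bytes_per_param: int = 2) -> float:
    """Gigabytes needed to hold the weights at the given precision."""
    return n_params * bytes_per_param / 1024**3


# e.g. a 70B-parameter model in fp16 (2 bytes per parameter):
needed = weight_vram_gb(70e9)  # ~130 GB, already more than a 96 GB card
```

Halving the precision (e.g. 8-bit or 4-bit quantisation) scales this estimate down proportionally, which is the usual workaround when the weights alone exceed the available VRAM.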
This is live on the leaderboards now, thanks to @Mikeriess! 🎉
This seems similar-ish to [this issue](https://github.com/ROCm/ROCm/issues/2536#issuecomment-1755682831). Can you see if any of these, or combinations of them, work?
> export PYTORCH_ROCM_ARCH="gfx1031"
> export HSA_OVERRIDE_GFX_VERSION=10.3.1
> export HIP_VISIBLE_DEVICES=0
> export ROCM_PATH=/opt/rocm
Progress! What happens if you set `AMD_SERIALIZE_KERNEL=3`? Maybe we'll get a more informative error.
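For reference, the variables from this and the previous comment could be collected into one snippet (the GFX values are copied from the linked issue and are specific to a gfx1031 card; they may need adjusting for other GPUs):

```shell
#!/bin/sh
# ROCm workarounds from the linked issue, plus the debug flag suggested
# above. Values target a gfx1031 GPU and are not universal.
export PYTORCH_ROCM_ARCH="gfx1031"      # target GPU architecture
export HSA_OVERRIDE_GFX_VERSION=10.3.1  # report a supported GFX version
export HIP_VISIBLE_DEVICES=0            # restrict to the first GPU
export ROCM_PATH=/opt/rocm              # ROCm installation prefix
export AMD_SERIALIZE_KERNEL=3           # serialize kernel launches for
                                        # more informative error messages
```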