mteb
Making ScalaClassification multilingual
Hey @KennethEnevoldsen, it seems like you added ScalaDaClassification, right? I was wondering if there's a reason why it's listed as multilingual (i.e., is support for more languages planned?) or is that a glitch?
Happy to offer a quick fix for that if needed.
Hello, in the file there is more than one task, each using a different version of this dataset for a different language.
A better formulation would be to make one ScalaClassification task that is a MultilingualTask and create one dataset repository with config_names containing the language, as was done in PR https://github.com/embeddings-benchmark/mteb/pull/575
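To make the suggestion concrete, here is a rough sketch of the idea in plain Python: one task definition that maps dataset config names to languages, instead of one task class per language. Everything in it is an assumption for illustration only: the class MultilingualTaskSketch is a hypothetical stand-in and not the mteb MultilingualTask API, the repository id "mteb/scala" is a placeholder, and the config names and language tags are examples rather than the actual layout of the dataset.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class MultilingualTaskSketch:
    """Hypothetical stand-in for a multilingual task; NOT the mteb API."""

    name: str
    dataset_path: str  # placeholder repository id (assumption)
    # Maps each dataset config name to the language tags it covers.
    eval_langs: dict[str, list[str]] = field(default_factory=dict)

    def load_kwargs(self, config_name: str) -> dict[str, str]:
        """Return the path/config pair that would select one language subset."""
        if config_name not in self.eval_langs:
            raise KeyError(f"Unknown config name: {config_name}")
        return {"path": self.dataset_path, "name": config_name}


# One task covering several languages, replacing the per-language classes.
scala = MultilingualTaskSketch(
    name="ScalaClassification",
    dataset_path="mteb/scala",  # hypothetical repository id
    eval_langs={
        "Danish": ["dan-Latn"],  # illustrative config name and tag
        "Swedish": ["swe-Latn"],  # illustrative config name and tag
    },
)

# Each language is then just another config of the same repository.
for config in scala.eval_langs:
    print(scala.load_kwargs(config))
```

The returned dictionary mirrors the path and config name arguments that a loader such as datasets.load_dataset accepts, so adding a language would only mean adding one entry to eval_langs.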
Ah, sorry, I missed that because my editor pulls up a single class or function without the entire module. But yeah, you are right that the approach in the PR makes it more explicit. I'll change the title to something more appropriate and leave the issue open for someone to pick up, but I guess it's not a priority at the moment.
@dokato you can open a PR if you'd like, and also ask to join the Hugging Face organization so that you can create a repository on mteb and upload the data.
Ok, I’ll give it a go
Yeah, there's no reason why they are monolingual. Would love to see a PR on this.