scandinavian-embedding-benchmark
A Scandinavian Benchmark for sentence embeddings
https://github.com/kuhumcst/danish-semantic-reasoning-benchmark
Extending the dataset to other Scandinavian languages

**Before implementing, check whether these resources are translated:**

- Greenlandic
- Danish-Greenlandic
- Greenlandic news
- Icelandic...
For example, ScaLA is natural text that has been synthetically augmented (and evaluated by humans). Other datasets could be constructed by translation, while others could be found or expert-generated. It is probably reasonable to...
https://arxiv.org/abs/2402.15449
Add metadata on socioeconomic status to the datasets.