Márton Kardos
I can't run it because I don't have compute access today, will run it when I can.
Also, we should probably wait for the other PR (#694) to merge, as the stratified subsampling will fail without the new code.
Yeah, we need to be careful about this! Can't we use a previous release of MTEB for evaluating models for the current leaderboard? I think it would be really nice...
Okay, I will just go for dummy subsampling here, as in VG, and we can add stratified subsampling once it is properly addressed.
I can't run it for E5, as UCloud is down today and it would take hours on my computer :')
Also added stratified subsampling code to `AbsTask` for multilabel problems, as this was missing.
```
[15:44] There are currently no machines available to run your job.
[15:44] A smaller machine might give you quicker access to your job.
[15:45] Job has been cancelled
```
...
Okay, I got a machine, hallelujah
Since the stratified subsampling doesn't work as expected with multilabel data, I think I will just go with a random sample of 2048 entries.
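Roughly what a plain random sample of 2048 entries could look like, as a minimal sketch only. The function name `random_subsample`, its parameters, and the seed are hypothetical and not taken from the `AbsTask` code; it assumes the split is available as two aligned lists of texts and label lists.

```python
import random


def random_subsample(texts, labels, n_samples=2048, seed=42):
    """Draw a plain (non-stratified) random subsample.

    Hypothetical helper for illustration; texts and labels are assumed
    to be aligned sequences of the same length.
    """
    if len(texts) != len(labels):
        raise ValueError("texts and labels must have the same length")
    # Nothing to subsample if the split is already small enough.
    if len(texts) <= n_samples:
        return list(texts), list(labels)
    # A seeded generator keeps the subsample reproducible across runs.
    rng = random.Random(seed)
    # Sample indices without replacement, then sort to keep original order.
    idx = sorted(rng.sample(range(len(texts)), n_samples))
    return [texts[i] for i in idx], [labels[i] for i in idx]
```

Sampling indices rather than the items themselves keeps each text paired with its own set of labels, which is the part that matters for multilabel data.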
@KennethEnevoldsen green light?