Márton Kardos

Results: 67 comments of Márton Kardos

Also, including Dummy classifier scores gives us a relatively good idea of the chance level in this multilabel case.
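To make the idea concrete, here is a minimal sketch of what such a chance-level baseline looks like for multilabel targets. It mimics a "most frequent" dummy classifier (as in scikit-learn's `DummyClassifier(strategy="most_frequent")`) in plain Python; all function names and the toy data are illustrative, not MTEB's actual API.

```python
# Hypothetical sketch: a "most frequent" dummy baseline for multilabel
# classification. Targets are binary vectors, one entry per label.

def dummy_most_frequent(y_train):
    """Predict, for each label, whichever value (0/1) is most common in training."""
    n_labels = len(y_train[0])
    prediction = []
    for j in range(n_labels):
        ones = sum(row[j] for row in y_train)
        prediction.append(1 if ones * 2 > len(y_train) else 0)
    return prediction

def label_accuracy(y_true, pred):
    """Mean per-label accuracy of a constant prediction -- the chance level."""
    correct = sum(
        int(row[j] == pred[j]) for row in y_true for j in range(len(pred))
    )
    return correct / (len(y_true) * len(pred))

# Toy multilabel data: 3 labels, the first usually on, the others usually off.
y = [[1, 0, 0], [1, 1, 0], [1, 0, 1], [0, 0, 0]]
baseline = dummy_most_frequent(y)   # -> [1, 0, 0]
print(label_accuracy(y, baseline))  # -> 0.75, the score a model must beat
```

A model scoring near this baseline has learned essentially nothing, which is exactly what reporting the dummy score alongside real models makes visible.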

E5 definitely performs better on the task than paraphrase-multilingual. I'm not sure about the subcategories; they might be a bit too much for some tasks. Though we could include it if...

Also, specific tasks are free to use whatever they want; if you find an MLP a better fit, you can specify it in the task. What are your thoughts on...

I was just thinking about this yesterday, I think I can take it :D

This is, again, the same kind of issue as with classification:
1. Do we only consider hierarchical clustering?
2. If so, do we penalise models for having gotten something...

I think versioning the models would make a lot of sense. Also, requiring people to specify what has changed between versions would be very useful (a changelog of some...
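A rough sketch of what versioned model metadata with a required changelog could look like; the class and field names here are purely illustrative assumptions, not an actual MTEB schema.

```python
# Hypothetical model metadata that forces a changelog entry on every
# version bump. Not MTEB's real schema -- just an illustration.
from dataclasses import dataclass, field

@dataclass
class ModelMeta:
    name: str
    version: str                  # e.g. a semver string or a git revision
    changelog: dict = field(default_factory=dict)  # version -> what changed

    def bump(self, new_version: str, changes: str) -> "ModelMeta":
        """Return metadata for a new version, refusing an empty changelog entry."""
        if not changes:
            raise ValueError("please describe what changed between versions")
        log = dict(self.changelog)
        log[new_version] = changes
        return ModelMeta(self.name, new_version, log)

meta = ModelMeta("intfloat/multilingual-e5-large", "1.0.0")
meta2 = meta.bump("1.1.0", "retrained on a deduplicated corpus")
print(meta2.version)                 # -> 1.1.0
print(meta2.changelog["1.1.0"])      # -> retrained on a deduplicated corpus
```

Making the changelog a hard requirement of the bump operation is one way to get the "specify what has changed" behaviour for free, rather than relying on convention.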

I totally agree with @KennethEnevoldsen here. I think keeping things nice and separate would amount to us making the most stupidly simple implementations of everything here in MTEB and abstract...

Also, what if the model is not usable in SentenceTransformers? We would ideally still have some version or revision for those. With proprietary embedding models it might also be reasonable...

We might get reasonable coverage of Macedonian by including Bulgarian; they are pretty much as close as Bokmål and Danish. The same goes for Serbian and Croatian (maybe also Slovenian)...

We might also consider some minority languages and regional dialects for the sake of fairness and ethics. A handful of examples I can think of:
- Romani is a minority language in a...