Márton Kardos

Results: 67 comments of Márton Kardos

Also, including Dummy classifier scores gives us a relatively good idea of the chance level in this multilabel case.
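To make the idea concrete, here is a minimal sketch of what such a chance-level baseline looks like for multilabel targets. It mimics a "most frequent" dummy classifier (as in scikit-learn's `DummyClassifier(strategy="most_frequent")`) in plain Python; all function names and the toy data are illustrative, not MTEB's actual API.

```python
# Hypothetical sketch: a "most frequent" dummy baseline for multilabel
# classification. Targets are binary vectors, one entry per label.

def dummy_most_frequent(y_train):
    """Predict, for each label, whichever value (0/1) is most common in training."""
    n_labels = len(y_train[0])
    prediction = []
    for j in range(n_labels):
        ones = sum(row[j] for row in y_train)
        prediction.append(1 if ones * 2 > len(y_train) else 0)
    return prediction

def label_accuracy(y_true, pred):
    """Mean per-label accuracy of a constant prediction -- the chance level."""
    correct = sum(
        int(row[j] == pred[j]) for row in y_true for j in range(len(pred))
    )
    return correct / (len(y_true) * len(pred))

# Toy multilabel data: 3 labels, the first usually on, the others usually off.
y = [[1, 0, 0], [1, 1, 0], [1, 0, 1], [0, 0, 0]]
baseline = dummy_most_frequent(y)   # -> [1, 0, 0]
print(label_accuracy(y, baseline))  # -> 0.75, the score a model must beat
```

A model scoring near this baseline has learned essentially nothing, which is exactly what reporting the dummy score alongside real models makes visible.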

E5 definitely performs better on the task than paraphrase-multilingual. I'm not sure about the subcategories; they might be a bit too much for some tasks. Though we could include it if...

Also, specific tasks are free to use whatever they want; if you find an MLP a better fit, you can specify it in the task. What are your thoughts on...

I was just thinking about this yesterday, I think I can take it :D

This is, again, the same kind of issue as with classification:
1. Do we only consider hierarchical clustering?
2. If so, do we penalise models for having gotten something...

I think versioning the models would make a lot of sense. Also, requiring people to specify what has changed between versions would be very useful (a changelog of some...
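A rough sketch of what versioned model metadata with a required changelog could look like; the class and field names here are purely illustrative assumptions, not an actual MTEB schema.

```python
# Hypothetical model metadata that forces a changelog entry on every
# version bump. Not MTEB's real schema -- just an illustration.
from dataclasses import dataclass, field

@dataclass
class ModelMeta:
    name: str
    version: str                  # e.g. a semver string or a git revision
    changelog: dict = field(default_factory=dict)  # version -> what changed

    def bump(self, new_version: str, changes: str) -> "ModelMeta":
        """Return metadata for a new version, refusing an empty changelog entry."""
        if not changes:
            raise ValueError("please describe what changed between versions")
        log = dict(self.changelog)
        log[new_version] = changes
        return ModelMeta(self.name, new_version, log)

meta = ModelMeta("intfloat/multilingual-e5-large", "1.0.0")
meta2 = meta.bump("1.1.0", "retrained on a deduplicated corpus")
print(meta2.version)                 # -> 1.1.0
print(meta2.changelog["1.1.0"])      # -> retrained on a deduplicated corpus
```

Making the changelog a hard requirement of the bump operation is one way to get the "specify what has changed" behaviour for free, rather than relying on convention.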

I totally agree with @KennethEnevoldsen here. I think keeping things nice and separate would amount to us making the most stupidly simple implementations of everything here in MTEB and abstract...

Also, what if the model is not usable in SentenceTransformers? We would ideally still have some version or revision for those. With proprietary embedding models it might also be reasonable...

We might get reasonable coverage of Macedonian by including Bulgarian; they are pretty much as close as Bokmål and Danish. The same goes for Serbian and Croatian (maybe also Slovenian)...

We might also consider some minority languages and regional dialects for the sake of fairness and ethics. A handful of examples I can think of:
- Romani is a minority language in a...