Márton Kardos
@KennethEnevoldsen What's the status on this? Would we need to hardcode a test like that? I'm also thinking it would probably be a good idea to add `superseeded_by` to `AbsTask`s...
That's actually a great idea. We probs have a dataset for that in-house, don't we?
Hmm, yeah, it would be nice if we could make it faster somehow. It's an especially good idea if we're planning on bootstrapping stuff.
It would also be cool to have a speed vs. performance plot like in the MTEB paper.
The later levels seem very hard. Maybe we should limit the levels to two?
I'm not sure whether the way I formulated the task makes sense. @rafalposwiata You added the dataset initially, so you might know: is the "disciplines" column hierarchically ordered or just...
I'm not sure though. The task formulation might be wrong. I think doing "scientific_fields" as the first level and "disciplines" as the second might be the way to go. From what...
Yes, unless the order isn't fixed, and I don't know whether it is (we have to check).
Nope, it's not hierarchical at all. We can maybe rephrase it as multilabel classification if we really want to, otherwise fine to leave it as flat clustering.
Are we sure we want machine-generated datasets? If we don't take machine-translated ones, why should we take machine-generated ones?