Autogenerate table of tasks

Open KennethEnevoldsen opened this issue 11 months ago • 7 comments

Seems like the table of tasks can be autogenerated from the existing metadata.

I would suggest removing it from the readme and then linking to a new autogenerated markdown.

What are some thoughts on this?
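As a rough illustration of the idea, the table could be rendered directly from the metadata records. This is only a minimal sketch: the `TASKS` list, its field names, and the `tasks_to_markdown` helper are hypothetical and not taken from the mteb codebase.

```python
# Hypothetical metadata records; the field names are illustrative only
# and are not mteb's actual schema.
TASKS = [
    {"name": "ExampleRetrieval", "type": "Retrieval", "languages": ["da", "nb"]},
    {"name": "ExampleClassification", "type": "Classification", "languages": ["en"]},
]


def tasks_to_markdown(tasks):
    """Render metadata records as a markdown table, sorted by task name."""
    header = "| Name | Type | Languages |"
    divider = "|------|------|-----------|"
    rows = [
        f"| {t['name']} | {t['type']} | {', '.join(t['languages'])} |"
        for t in sorted(tasks, key=lambda t: t["name"])
    ]
    return "\n".join([header, divider, *rows])


if __name__ == "__main__":
    # The output could be written to a separate markdown file linked from the readme.
    print(tasks_to_markdown(TASKS))
```

Because the table is derived from the same records the tasks already carry, it cannot drift out of sync with them the way a hand-maintained readme table can.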

KennethEnevoldsen, Mar 18 '24 09:03

In the same vein, I'd love if the metadata was a dataclass as well. This makes it easy to:

  • Type-hint the attributes
  • Auto-complete in IDEs
  • Get runtime-errors if keys/attributes are missing

I've created this as a separate issue, #257.
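A minimal sketch of what that might look like, assuming a hypothetical `TaskMetadata` class with made-up fields (not the actual mteb definition):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskMetadata:
    """Hypothetical metadata container; the fields are illustrative only."""

    name: str
    type: str
    languages: list[str]
    description: str = ""  # optional field with a default


# Attributes are type-hinted, so IDEs can auto-complete `meta.name`, etc.
meta = TaskMetadata(
    name="ExampleClassification",
    type="Classification",
    languages=["en"],
)
print(meta.name)

# Omitting a required field fails at construction time with a TypeError,
# instead of surfacing later as a missing dictionary key.
try:
    TaskMetadata(name="Incomplete")
except TypeError as err:
    print(f"Rejected: {err}")
```

Note that a plain dataclass only catches missing or unexpected fields; it does not check at runtime that the values match the type hints.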

MartinBernstorff, Mar 18 '24 11:03

Yep let's go! Autogeneration >>> manual generation 🚀

imenelydiaker, Mar 18 '24 15:03

Wonderful, glad we agree! Will see if I can find the time to implement it.

KennethEnevoldsen, Mar 18 '24 16:03

This PR awaits #260. @MartinBernstorff, since the table contains the size of the datasets and it would be problematic to fetch all datasets to generate it, it might be ideal to add the dataset sizes to the metadata. What do you think?

KennethEnevoldsen, Mar 19 '24 16:03

Makes a lot of sense. There might be some work in formalising what "size" means for each task type, but let's cross that bridge when we get there!
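One possible shape for this, sketched under assumptions: sizes are stored as a per-split count in a hypothetical `n_samples` field, so the table can be built without downloading any dataset. What a "sample" counts for each task type (documents, pairs, queries) is deliberately left open here.

```python
from dataclasses import dataclass, field


@dataclass
class TaskMetadata:
    """Hypothetical metadata container with dataset sizes stored inline."""

    name: str
    type: str
    # Assumed convention: split name -> number of samples in that split.
    n_samples: dict[str, int] = field(default_factory=dict)


def format_size(meta: TaskMetadata) -> str:
    """Format the stored sizes for a table cell without loading the dataset."""
    if not meta.n_samples:
        return "n/a"
    return ", ".join(
        f"{split}: {count:,}" for split, count in sorted(meta.n_samples.items())
    )


if __name__ == "__main__":
    example = TaskMetadata(
        name="ExampleClassification",
        type="Classification",
        n_samples={"train": 1000, "test": 250},
    )
    print(format_size(example))
```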

MartinBernstorff, Mar 20 '24 08:03

[Image: Screenshot 2024-03-20 at 09 41 21]

^was just thinking of this

KennethEnevoldsen, Mar 20 '24 08:03

This seems to be fixed by #525 and #541. Should it be closed?

HLasse, Apr 24 '24 12:04