CodeSearchNet task

Open Rabrg opened this issue 2 years ago • 9 comments

Would the maintainers be interested in adding a code retrieval task (CodeSearchNet, which uses text queries to retrieve code documents), either as a new code retrieval task type or within the existing retrieval category?

Nov 18 '22 06:11 Rabrg

Yes definitely! I think it would be best to add it to the existing retrieval category if possible. It would require some changes in the code to differentiate between BEIR and non-BEIR, but should not be too difficult.

The language splits would then be the programming languages (Go, Java, etc.).
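For illustration only, here is a minimal sketch of what per-language splits for a text-to-code retrieval task could look like. It does not use MTEB's or BEIR's actual APIs; the toy data, the bag-of-words `embed` function standing in for a real embedding model, and the names `SPLITS`, `recall_at_k`, and `evaluate` are all hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical toy data: one split per programming language, each with a code
# corpus, natural-language queries, and relevance judgments (qrels).
SPLITS = {
    "go": {
        "corpus": {
            "g1": "func Add(a int, b int) int { return a + b }",
            "g2": "func ReadFile(path string) ([]byte, error) { return os.ReadFile(path) }",
        },
        "queries": {"q1": "add two int values", "q2": "read file from path"},
        "qrels": {"q1": {"g1"}, "q2": {"g2"}},
    },
    "java": {
        "corpus": {
            "j1": "int max(int a, int b) { return a > b ? a : b; }",
            "j2": "String reverse(String s) { return new StringBuilder(s).reverse().toString(); }",
        },
        "queries": {"q1": "max of two int values", "q2": "reverse string"},
        "qrels": {"q1": {"j1"}, "q2": {"j2"}},
    },
}


def embed(text):
    """Toy stand-in for an embedding model: lowercase bag-of-words counts."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def recall_at_k(split, k=1):
    """Fraction of queries with at least one relevant document in the top k."""
    doc_vecs = {doc_id: embed(code) for doc_id, code in split["corpus"].items()}
    hits = 0
    for query_id, query in split["queries"].items():
        query_vec = embed(query)
        ranked = sorted(
            doc_vecs, key=lambda d: cosine(query_vec, doc_vecs[d]), reverse=True
        )
        if split["qrels"][query_id] & set(ranked[:k]):
            hits += 1
    return hits / len(split["queries"])


def evaluate(splits, k=1):
    """Score each language split separately, keyed by language name."""
    return {lang: recall_at_k(split, k) for lang, split in splits.items()}


if __name__ == "__main__":
    print(evaluate(SPLITS, k=1))
```

Reporting one score per language mirrors how a multilingual retrieval task would report one score per natural language, so the same reporting structure could be reused.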

Do you want to tackle this? Very happy to help along the way! 🤗

Nov 18 '22 06:11 Muennighoff

i haven't gotten around to this yet, but would be happy to give it a shot

Dec 02 '22 17:12 Rabrg

Amazing! Let me know when you run into problems 👍

Dec 02 '22 17:12 Muennighoff

Any progress on this? :)

Dec 05 '23 19:12 Manouchehri

I think @bwanglzu & team have also been working on this - Let us know if we can help in any way!

Dec 05 '23 19:12 Muennighoff

Oh nice! Is there a fork or WIP script anywhere?

Dec 05 '23 19:12 Manouchehri

Hi @Muennighoff, @Manouchehri and all, yes, we're working on this. Actually a bit more than that: we'll add ~3 coding tasks to MTEB :)

The reason behind this is that we're training a coding embedding model, similar to jina-embeddings-v2 :)

Dec 06 '23 10:12 bwanglzu

Amazing, really looking forward to it! Let us know if we can help :)

Dec 06 '23 16:12 Muennighoff

Hello everyone! I'll be working on getting this integrated 🤗

Apr 11 '24 21:04 cassanof