BLINK
How to train on a new knowledge base?
It seems that a lot of people have asked how to train BLINK on a new knowledge base (i.e. a set of entities + descriptions), but unfortunately I couldn't find any relevant information.
May I ask you to add just a quick guide here please?
I've created a new repository for training bi-encoder models. Following this tutorial, you can train the model on a newer Wikipedia dump (or one in another language) using the BLINK code, or by following this tutorial.
@Giovani-Merlin It seems the tutorial links you posted are no longer working. Could you repost them?
@amirj: You can look at this tutorial https://github.com/facebookresearch/BLINK/issues/116
The link seems to return a 404. Could you please update it to the right link, @Giovani-Merlin? Thanks a lot~
@Giovani-Merlin: Can you provide access to the mentioned repository?
@Giovani-Merlin I would also be very grateful for access to your tutorial :)
@viraj-lakshitha @gromajus @kongmoumou @driscoll42 Hello! Sorry, a bit late, but I needed to make considerable changes in the tutorials/repo as I was unsatisfied with the final results. I've split the repo into two parts:
WBDSM for creating the dataset (from any Wikipedia dump, in any language): https://github.com/Giovani-Merlin/wbdsm
Bet for training bi-encoder models: https://github.com/Giovani-Merlin/bet
The results are fantastic. You can follow the process illustrated at https://github.com/Giovani-Merlin/bet/blob/main/docs/results.md to train a custom model or to benchmark on the Zeshel dataset.
If you have any questions or issues, please use the issue tracker of the respective repo :) Later on I will improve the tutorials/documentation.
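For anyone new to the setup, here is a minimal sketch of what a bi-encoder does with a knowledge base of entities + descriptions: both the mention context and each entity are turned into vectors, and candidates are ranked by dot product. This is not code from BLINK, WBDSM or Bet. The `Entity` class, the `encode` and `rank` functions, and the two knowledge base entries are hypothetical, and a normalised bag-of-words vector stands in for the trained transformer encoders.

```python
import math
from collections import Counter
from dataclasses import dataclass


@dataclass
class Entity:
    """One knowledge base entry: a title plus a free-text description."""
    title: str
    description: str


def encode(text):
    """Stand-in encoder (assumption): L2-normalised bag-of-words as a sparse dict.

    A real bi-encoder would use two trained transformer encoders here.
    """
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {tok: c / norm for tok, c in counts.items()} if norm else {}


def dot(a, b):
    """Dot product of two sparse vectors stored as dicts."""
    return sum(weight * b.get(tok, 0.0) for tok, weight in a.items())


def rank(mention_context, kb):
    """Score every entity against the mention context, best first."""
    query = encode(mention_context)
    scored = [
        (dot(query, encode(f"{e.title} {e.description}")), e.title)
        for e in kb
    ]
    return sorted(scored, reverse=True)


# Illustrative knowledge base; the entries are made up for this sketch.
KB = [
    Entity("Blink (browser engine)",
           "browser engine developed as part of the Chromium project"),
    Entity("BLINK (entity linking)",
           "entity linking library using a bi-encoder and cross-encoder"),
]

if __name__ == "__main__":
    for score, title in rank("which entity linking library uses a bi-encoder", KB):
        print(f"{score:.3f}  {title}")
```

Training on a new knowledge base then amounts to replacing the toy `KB` with your own entities and descriptions and learning the two encoders from labelled mentions, which is the part the repos above cover.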
I won't have time for a few weeks, but I will definitely give this a shot. Thanks for updating it!