
Example script for downstream task

pribadihcr opened this issue Sep 10, 2020 · 3 comments

hi @timoschick,

Is there any plan to provide example code for the downstream tasks mentioned in the paper? Thanks.

pribadihcr avatar Sep 10 '20 16:09 pribadihcr

Hi @pribadihcr, yes, we definitely plan to release the example code. However, the code is still kind of messy and I am really busy right now so it might take a few weeks :(

timoschick avatar Sep 11 '20 09:09 timoschick

Hi @timoschick,

Could you give me some hints on the procedure for the classification task? E.g., how to use a pretrained BERTRAM model, etc.

pribadihcr avatar Sep 28 '20 01:09 pribadihcr

Sure, this is relatively straightforward. You can start from any of the examples found here, or from any other script that uses HuggingFace's Transformers for a downstream task. Before training/evaluating, you just need to load the BERTRAM model corresponding to the pretrained model used in the script and call:

 bertram.add_word_vectors_to_model(words_with_contexts, tokenizer, model)

where bertram is your BERTRAM model, tokenizer and model are the tokenizer and model used in your training/evaluation script, and words_with_contexts is a dictionary from rare words to a list of contexts in which they occur. For each rare word w, this will add a new token <BERTRAM:w> to the model's (and tokenizer's) embedding space. When processing examples for training/evaluation, you then simply need to replace each occurrence of a rare word w with <BERTRAM:w> (or with <BERTRAM:w> / w if you wish to use the "slash"-variant described in the paper).
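To make this concrete, here is a rough sketch of the full flow for a sequence classification task. The checkpoint name, the `BertramWrapper` class, the `device` argument and the `substitute_rare_words` helper are placeholders for illustration only; the actual interface described above is `add_word_vectors_to_model` and the `<BERTRAM:w>` token format:

```python
from transformers import BertTokenizer, BertForSequenceClassification
from bertram import BertramWrapper  # assumed import; see bertram.py in this repo

# Tokenizer and model exactly as used in your downstream-task script
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertForSequenceClassification.from_pretrained('bert-base-uncased', num_labels=2)

# Load a BERTRAM instance matching the pretrained model (checkpoint name is a placeholder)
bertram = BertramWrapper('bertram-add-for-bert-base-uncased', device='cuda')

# Map each rare word to a list of contexts in which it occurs
words_with_contexts = {
    'kumquat': [
        'the kumquat is a small citrus fruit',
        'she planted a kumquat tree in the garden',
    ],
}

# Adds one token <BERTRAM:w> per rare word w to the model's and tokenizer's embeddings
bertram.add_word_vectors_to_model(words_with_contexts, tokenizer, model)

# Hypothetical helper: replace rare words with their <BERTRAM:w> tokens
# (or with '<BERTRAM:w> / w' for the "slash" variant from the paper)
def substitute_rare_words(text, rare_words, slash=False):
    for w in rare_words:
        token = f'<BERTRAM:{w}> / {w}' if slash else f'<BERTRAM:{w}>'
        text = text.replace(w, token)  # naive replacement; a real script should respect word boundaries
    return text

example = substitute_rare_words('I bought a kumquat yesterday.', words_with_contexts)
inputs = tokenizer(example, return_tensors='pt')
outputs = model(**inputs)
```

After this preprocessing step, training and evaluation proceed exactly as in the unmodified script.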

timoschick avatar Oct 02 '20 07:10 timoschick