
help

Open Aliyasjohn opened this issue 3 years ago • 2 comments

Sir, can you explain what I should do to implement WSD using BERT for another language?

Aliyasjohn avatar Apr 01 '22 13:04 Aliyasjohn

Hi, thanks for your question.

For other languages, you would need:

  1. A checkpoint of BERT pretrained on a corpus of your desired language;
  2. Training and testing datasets in the desired language, preprocessed into .csv files with the columns ["id", "sentence", "sense_keys", "glosses", "targets"].
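As a rough illustration of step 2, a sketch of what such a .csv file might look like is below. All values here are made up for illustration (the ids, sense keys, and glosses are hypothetical, not real WordNet entries), and the exact sentence-markup convention the repository expects may differ — check the provided English datasets for the authoritative format:

```python
import csv
import io

# Column order BERT-WSD's preprocessed datasets use (from the list above).
COLUMNS = ["id", "sentence", "sense_keys", "glosses", "targets"]

# One hypothetical example row for a German corpus. The candidate sense
# keys and glosses would come from a sense inventory for your language;
# "targets" holds the index/indices of the correct sense(s).
rows = [
    {
        "id": "d000.s000.t000",
        "sentence": "Er setzte sich auf die Bank im Park .",
        "sense_keys": "['bank%1:06:00::', 'bank%1:14:00::']",
        "glosses": "['a long seat for several people', 'a financial institution']",
        "targets": "[0]",
    },
]

def write_dataset(rows, fh):
    """Write rows to CSV with the expected header and column order."""
    writer = csv.DictWriter(fh, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(rows)

buf = io.StringIO()
write_dataset(rows, buf)
print(buf.getvalue().splitlines()[0])  # prints the header line
```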

Once you have gathered the model checkpoint and datasets, you should be able to fine-tune the checkpoint using the following command:

python script/run_model.py \
    --do_train \
    --evaluate_during_training \
    --train_path "<path_to_training_csv>" \
    --eval_path "<path_to_validation_csv>" \
    --model_name_or_path "<path_to_pretrained_checkpoint>" \
    --output_dir "model/finetuned" \
    --per_gpu_train_batch_size 8 \
    --gradient_accumulation_steps 16 \
    --learning_rate 2e-5 \
    --num_train_epochs 4 \
    --logging_steps 1000 \
    --save_steps 1000
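Note that the batch-size flags in the command combine: gradients are accumulated over several small forward passes before each optimizer step, so the effective batch size per GPU is the product of the two values (a common pattern for fitting large batches into limited GPU memory):

```python
# Effective batch size implied by the flags in the command above.
per_gpu_train_batch_size = 8
gradient_accumulation_steps = 16

effective_batch_size = per_gpu_train_batch_size * gradient_accumulation_steps
print(effective_batch_size)  # 128 examples per optimizer step, per GPU
```

If you run out of GPU memory, you can lower --per_gpu_train_batch_size and raise --gradient_accumulation_steps to keep the same effective batch size.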

Hope this helps. Cheers.

BPYap avatar Apr 02 '22 03:04 BPYap

Thank you so much, sir, for your assistance with this matter. I will let you know if I require any further help. Thanks! :)

Aliyasjohn avatar Apr 04 '22 06:04 Aliyasjohn