Per E Kummervold
@irhallac It is the [unusedXXX] tokens that can be replaced with any word you like. I am running some experiments on how effective this really is, but from my understanding you...
They come in two blocks. From line #2 [unused0] to line #100 [unused98]. Then there are 4 tokens that absolutely should not be changed: [UNK] [CLS] [SEP] [MASK]. Then they continue from line #105 to line...
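For anyone who wants to try this, here is a minimal sketch of how the [unused] slots could be swapped for domain words by rewriting vocab.txt. The file names and the word list are just placeholders, not something from this thread:

```python
# Sketch: replace [unusedX] tokens in a BERT vocab.txt with domain-specific words.
# Assumes a plain-text vocab file with one token per line; paths and the word
# list below are placeholders.
domain_words = ["immunotherapy", "cytokine", "biomarker"]  # hypothetical examples

with open("vocab.txt", encoding="utf-8") as f:
    vocab = [line.rstrip("\n") for line in f]

existing = set(vocab)
replacements = iter(w for w in domain_words if w not in existing)

for i, token in enumerate(vocab):
    # Only touch the [unusedX] placeholders; never the special tokens
    # ([UNK], [CLS], [SEP], [MASK]) mentioned above.
    if token.startswith("[unused"):
        try:
            vocab[i] = next(replacements)
        except StopIteration:
            break

with open("vocab_domain.txt", "w", encoding="utf-8") as f:
    f.write("\n".join(vocab) + "\n")
```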
OK, I did not know that. Then it is only the uncased version that has 1000 unused slots.
@irhallac Let me post an update on my experiences with using vocab files during pretraining on a domain-specific corpus. As far as I know, the only reasonable way to test...
I did a few more tests on this (as I mentioned in another post). I am no longer convinced by my own results. The challenge is that fine-tuning has a...
@muhammadfahid51 If I understand things correctly, BERT works at the token level. In addition, it learns multi-token embeddings. Let's say we have the word "goodness", and let's say this does not exist...
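To make the "goodness" example concrete, here is a quick way to see how a WordPiece vocabulary splits a word. This sketch uses the Hugging Face transformers tokenizer as a stand-in, which is my own assumption and not part of the original BERT repo:

```python
# Sketch: show how WordPiece splits a word into subword pieces.
# The exact split depends on the vocabulary, so "goodness" may end up as a
# single token or as several "##" pieces.
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
print(tokenizer.tokenize("goodness"))        # e.g. ['goodness'] or ['good', '##ness']
print(tokenizer.tokenize("immunotherapy"))   # likely split into several ## pieces
```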
@muhammadfahid51 Don't interpret any of this as "correct" answers. I am just another researcher struggling with the same issues. You can use SentencePiece to build any vocabulary from scratch. SentencePiece...
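Here is a minimal sketch of training a vocabulary from scratch with SentencePiece. The file names, vocab size and model type are placeholders:

```python
# Sketch: train a SentencePiece model on your own corpus to build a vocabulary
# from scratch. "corpus.txt", the vocab size and the model type are placeholders.
import sentencepiece as spm

spm.SentencePieceTrainer.train(
    input="corpus.txt",        # one sentence per line
    model_prefix="domain_sp",  # writes domain_sp.model and domain_sp.vocab
    vocab_size=32000,
    model_type="unigram",      # "bpe" is the other common choice
)

sp = spm.SentencePieceProcessor(model_file="domain_sp.model")
print(sp.encode("goodness", out_type=str))
```

As far as I know, the resulting .vocab file is not a drop-in vocab.txt for BERT; it would still need to be converted to the WordPiece format.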
@muhammadfahid51 Take a look at this page: https://github.com/google-research/bert/blob/master/multilingual.md#list-of-languages
Absolutely. Doing additional domain-specific pretraining is very effective. How effective it is will depend on your task and corpus. There are lots of examples of its efficacy. Here is just one: https://arxiv.org/pdf/2005.07503...
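For reference, a rough sketch of what additional domain-specific pretraining can look like if you use the Hugging Face transformers Trainer instead of the original run_pretraining.py. Everything here, including the corpus file name and the hyperparameters, is an illustration and not taken from the paper above:

```python
# Sketch: continue masked-language-model pretraining of an existing BERT
# checkpoint on a domain corpus. Paths and hyperparameters are placeholders.
from datasets import load_dataset
from transformers import (BertForMaskedLM, BertTokenizerFast,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

model = BertForMaskedLM.from_pretrained("bert-base-uncased")
tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")

# One document or sentence per line in domain_corpus.txt (placeholder name).
dataset = load_dataset("text", data_files={"train": "domain_corpus.txt"})["train"]
dataset = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=128),
    batched=True, remove_columns=["text"],
)

# The collator handles dynamic masking of 15% of the tokens.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="bert-domain", num_train_epochs=1,
                           per_device_train_batch_size=16),
    train_dataset=dataset,
    data_collator=collator,
)
trainer.train()
```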
@nagads I understand your question, and I have gotten it several times before. I usually answer it with "I'll tell you, if you can first tell me what a boat costs!". It...