Results: 45 comments of pere

Do you have any timeframe for the release of multilingual ALBERT weights?

@0x0539 Any chance you can give an update on this? Is it worth waiting for a multilingual ALBERT?

Thanks @ngoanpv! Just what I needed to know. For preparing the dataset, did you build your own custom vocabulary using SentencePiece? Did you use the default dupe_factor of 40? If...

Just made some tests. I am not able to get as high a batch size as you mentioned, @ngoanpv. I am able to train with a batch_size of 392 on the...

Hi @stefan-it. Commenting with my experiences here. I ended up using model_type=bpe, since the BERT paper mentions that WordPiece is similar to BPE. I can't find native support for building...

Thanks, @ngoanpv. Tried again, and was able to get the batch size above 500 by turning off the dropout layers.
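Turning off dropout here amounts to zeroing the two dropout fields in the model's config file; a sketch of the relevant fragment (field names follow the BERT/ALBERT config format, the rest of the config is omitted):

```json
{
  "attention_probs_dropout_prob": 0.0,
  "hidden_dropout_prob": 0.0
}
```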

@steindor. Sorry for the confusion. I mean a v3-8 with 128 GB of memory and 8 cores. I have done quite a lot of experiments on this after this post was written....

I have a domain-specific (medical) English corpus that I want to do some additional pre-training on from the BERT checkpoint. However, there are quite a lot of words in...

@samreenkazi. I ended up using spaCy to make a list of all the words in a portion of the corpus. There are easy built-in functions for listing, for instance...
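A minimal sketch of that word-listing step with spaCy (pip install spacy). It uses a blank English tokenizer so no model download is needed; the inline text is a made-up stand-in for a slice of the real corpus:

```python
# Sketch: list and count the words in a corpus slice with spaCy.
import spacy
from collections import Counter

nlp = spacy.blank("en")  # tokenizer only; no pretrained model required

# Placeholder text standing in for a portion of the corpus.
text = "The patient presented with acute myocardial infarction. " * 3

counts = Counter(
    tok.text.lower()
    for tok in nlp(text)
    if tok.is_alpha  # keep alphabetic tokens, skip punctuation and numbers
)
print(counts.most_common(10))
```

From a list like this, the rare domain-specific words can then be compared against the existing BERT vocabulary.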

Please contact me at "per at capia dot no" and I'll send you the code.