RBERT
Implementation of BERT in R
Hi, I am trying to run RBERT in tensorflow on small dataset. I have installed Tensorflow using the miniconda environment. Below is the code which throws the error: ``` Sys.setenv(RETICULATE_PYTHON...
Might be Windows-specific. Downloading "bert_large_uncased_wwm" failed (the actual download.file step). Removing `method = "libcurl"` fixed it. I don't remember why we specify the method, need to try on different OSs....
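A minimal sketch of the workaround described above, assuming a hypothetical wrapper named `download_checkpoint_file` (not part of RBERT). It passes `method` to `utils::download.file` only when the caller asks for one, so R otherwise picks its platform default:

```r
# Hypothetical helper, not RBERT's actual implementation.
# `method` is forwarded only when supplied; otherwise download.file()
# chooses its own default for the current OS and URL scheme.
download_checkpoint_file <- function(url, destfile, method = NULL) {
  args <- list(url = url, destfile = destfile, mode = "wb", quiet = TRUE)
  if (!is.null(method)) {
    args$method <- method
  }
  status <- do.call(utils::download.file, args)
  # download.file() returns 0 on success.
  invisible(status == 0L)
}
```

Forcing `method = "libcurl"` remains possible through the argument, which would make it easier to compare behaviour across operating systems.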
Currently, we test with BERT_base, but we may as well use the smallest available model.
It doesn't complain right away, but if you run enough models, you get a message like: > WARNING:tensorflow:5 out of the last 6 calls to triggered tf.function retracing. Tracing is...
Currently, I attach tt_ids as an attribute to the tokenized input in `tokenize_input`. It feels like a misuse of attributes, but I also don't want to, say, pass the tt_ids...
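To illustrate the trade-off, here is a rough sketch with a hypothetical toy tokenizer, `tokenize_pair`, that splits on whitespace (it is not RBERT's `tokenize_input`). It shows token type ids carried as an attribute versus as a list element, and one reason attributes can be fragile: base R subsetting drops them.

```r
# Hypothetical stand-in for a tokenizer: whitespace split only.
# Token type id 0 covers [CLS], segment A and its [SEP];
# token type id 1 covers segment B and the final [SEP].
tokenize_pair <- function(seg_a, seg_b) {
  tok_a <- strsplit(seg_a, " ", fixed = TRUE)[[1]]
  tok_b <- strsplit(seg_b, " ", fixed = TRUE)[[1]]
  tokens <- c("[CLS]", tok_a, "[SEP]", tok_b, "[SEP]")
  tt_ids <- c(rep(0L, length(tok_a) + 2L), rep(1L, length(tok_b) + 1L))
  list(tokens = tokens, tt_ids = tt_ids)
}

res <- tokenize_pair("hello world", "good bye")

# Option 1: carry tt_ids as an attribute on the token vector.
with_attr <- structure(res$tokens, tt_ids = res$tt_ids)
attr(with_attr, "tt_ids")

# Subsetting drops the attribute, so it is easy to lose silently.
dropped <- attr(with_attr[1:3], "tt_ids")
is.null(dropped)

# Option 2: keep tokens and tt_ids side by side in the returned list.
res$tt_ids
```

The list form keeps both pieces explicit at the cost of changing what callers receive; which is preferable depends on how the tokenized input is consumed downstream.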
I think we decided this would be a good thing to do.
The filename says it all. These were hastily written to get the branch into a working state, and should be refactored with more attention paid to speed and safety.
There are several tokenization conventions (e.g. the token used for padding, separating segments, etc.) that need to be specified when doing the wordpiece tokenization for BERT. Currently, some of these...
`extract_features`, `download_BERT_checkpoint`, and probably some other functions use the model parameter, with a hard-coded list of models. Investigate listing those models in one place and automatically updating the formals of...
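One possible shape for this, sketched under assumptions: the vector `supported_models`, the helper `set_model_choices`, and the function `describe_model` are all hypothetical names, and the model list is a placeholder rather than the package's real one. The idea is to keep the choices in a single vector and write them into each function's `model` formal, so `match.arg()` validates against the shared list.

```r
# Placeholder list; the real set of models would live in one place in the package.
supported_models <- c("bert_base_uncased", "bert_large_uncased_wwm")

# Hypothetical helper: overwrite the default of `model` with the shared choices.
set_model_choices <- function(fn, choices = supported_models) {
  formals(fn)$model <- choices
  fn
}

# Hypothetical function standing in for anything that takes a `model` argument.
describe_model <- function(model = NULL) {
  # match.arg() reads the choices from the function's formals.
  model <- match.arg(model)
  paste("using", model)
}
describe_model <- set_model_choices(describe_model)

describe_model()                          # falls back to the first choice
describe_model("bert_large_uncased_wwm")  # an explicit, valid choice
```

An unsupported name then fails inside `match.arg()` with an error listing the valid options, so the list only has to be maintained in one spot.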