Congcong Wang
There is a `__pow__` method. Why is this method still needed? Just curious.
Have a look at this. https://huggingface.co/transformers/model_doc/xlnet.html#xlnettokenizer Hope it helps.
Hope this link will help: https://github.com/wangcongcong123/AllenNLPonWins
My understanding is that query is padded on the left while match_hist is padded on the right. When calculating the einsum between them, shouldn't the padding be consistent?
Actually, you could consider character-level training, but that would change the vocabulary and might require pretraining. If resources allow, you could try pretraining a GPT-2 on a large amount of code data.
Have you tried downgrading transformers?
Simply change `masked_lm_labels` to `labels` in the latest version.
If you want to use the latest transformers, just change `original_masked_lm_labels = [-1] * max_seq_length` at line 200 in cbert_utils.py to `original_masked_lm_labels = [-100] * max_seq_length`. Then here you...
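For context, a minimal sketch of why the value changes (an assumption based on PyTorch's `CrossEntropyLoss`, whose default `ignore_index` is -100; `max_seq_length = 8` here is just an illustrative value):

```python
# Newer transformers versions compute the masked-LM loss with PyTorch's
# CrossEntropyLoss, whose default ignore_index is -100, so positions that
# should not contribute to the loss must be labeled -100 rather than -1.
max_seq_length = 8  # example value; the real script uses its own setting

# Old (pre-change): ignored positions marked with -1
# original_masked_lm_labels = [-1] * max_seq_length

# New: ignored positions marked with -100
original_masked_lm_labels = [-100] * max_seq_length
print(original_masked_lm_labels)
```

Real token positions would then overwrite the -100 placeholders with their actual label ids, leaving only the padded/unmasked positions ignored by the loss.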