xlnet-Pytorch
Simple XLNet implementation with Pytorch Wrapper
The default setting uses bidirectional data (attn_type='bi') but bsz=1. But this function, https://github.com/graykode/xlnet-Pytorch/blob/cb793a1c75bdc59e3360f04ec641af726719811f/xlnet.py#L371, shows that bidirectional data only works when bsz % 2 == 0. However, by default,...
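For illustration, a minimal sketch of why splitting a batch into a forward half and a reversed half forces an even batch size; the function name and tensor layout here are assumptions, not the repository's code:

```python
import torch

# Hypothetical sketch: with bidirectional data the batch is split into a forward
# half and a reversed half, so the batch size must be divisible by two.
def split_bidirectional_batch(inputs, bi_data=True):
    bsz = inputs.size(0)
    if bi_data:
        assert bsz % 2 == 0, "bidirectional data requires bsz % 2 == 0"
        fwd = inputs[: bsz // 2]
        bwd = inputs[bsz // 2 :].flip(dims=[1])  # reverse the sequence dimension
        return torch.cat([fwd, bwd], dim=0)
    return inputs

batch = torch.arange(12).reshape(4, 3)   # bsz=4, seq_len=3
print(split_bidirectional_batch(batch))  # bsz=1 would trip the assertion
```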
It seems that initializing the parameter with **randn** (https://github.com/graykode/xlnet-Pytorch/blob/cb793a1c75bdc59e3360f04ec641af726719811f/xlnet.py#L119) **leads to low performance**. I tried **xavier_norm** and **kaiming_uniform**, and both reach a much higher AUC and F1 score in my...
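A minimal sketch of the two initializers the report mentions, assuming "xavier_norm" refers to PyTorch's `xavier_normal_`; the tensor shapes are placeholders, not the repository's actual dimensions:

```python
import torch
import torch.nn as nn

d_model, vocab = 512, 32000

# Initialization in the style of xlnet.py#L119: plain randn.
randn_weight = torch.randn(vocab, d_model)

# Alternatives the issue reports trying; any performance gain will depend on the task.
xavier_weight = torch.empty(vocab, d_model)
nn.init.xavier_normal_(xavier_weight)

kaiming_weight = torch.empty(vocab, d_model)
nn.init.kaiming_uniform_(kaiming_weight)
```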
https://github.com/graykode/xlnet-Pytorch/blob/cb793a1c75bdc59e3360f04ec641af726719811f/xlnet.py#L163 In your implementation, the FFN module has only one linear layer. Is this a bug?
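For comparison, a sketch of the two-layer position-wise FFN described in the Transformer/XLNet papers; this is a reference implementation, not the repository's code:

```python
import torch.nn as nn

class PositionwiseFFN(nn.Module):
    """Standard two-layer feed-forward block with residual connection and LayerNorm."""

    def __init__(self, d_model, d_inner, dropout=0.1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(d_model, d_inner),   # expand
            nn.ReLU(),
            nn.Dropout(dropout),
            nn.Linear(d_inner, d_model),   # project back
            nn.Dropout(dropout),
        )
        self.layer_norm = nn.LayerNorm(d_model)

    def forward(self, x):
        return self.layer_norm(x + self.net(x))
```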
When I run the code on Colab, I get the above error. I wonder what I did wrong and what the suitable environment requirements are. Thank...
@graykode Can you explain how batch training would be conducted? For example, what if we had multiple input files for training data? Currently, training is done using only a single...
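One possible way to keep the existing single-file pipeline and still train on several files is to concatenate them into one corpus first; the helper and file names below are assumptions, not part of the repository:

```python
from pathlib import Path

def load_corpus(paths):
    """Concatenate several plain-text training files into one string."""
    texts = [Path(p).read_text(encoding="utf-8") for p in paths]
    return "\n".join(texts)

# Hypothetical file names; the result could be written back out and passed
# to main.py the same way data.txt is used today.
corpus = load_corpus(["data1.txt", "data2.txt"])
Path("merged_data.txt").write_text(corpus, encoding="utf-8")
```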
First, the error, which I get both when trying to run the notebook locally (Ubuntu 18.04) and from Colab: ``` Traceback (most recent call last): File "main.py", line 89,...
RuntimeError: Expected object of scalar type Byte but got scalar type Bool for argument #2 'other'
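This error typically appears on PyTorch 1.2 and later, where comparison operators return `torch.bool` masks while older code builds `torch.uint8` (Byte) masks. A minimal illustration and one possible workaround; this is an assumption, not the repository's official fix:

```python
import torch

# Two masks of mismatched dtype, the usual trigger for this RuntimeError.
byte_mask = torch.ones(4, dtype=torch.uint8)  # Byte mask, old-style
bool_mask = torch.arange(4) > 1               # comparison returns torch.bool

# Workaround: cast both masks to the same dtype before combining them.
combined = byte_mask.bool() & bool_mask
print(combined)
```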
If I run the code with default arguments (and use data.txt from the repository), I get the following message: ``` Traceback (most recent call last): File "C:/Users/matej/git/xlnet-Pytorch/main.py", line 89, in...
The usage code runs successfully, but I can't find the saved parameters.
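If the training loop does not write a checkpoint on its own, a minimal sketch for saving and restoring the weights manually; the function names and file path are assumptions:

```python
import torch

def save_checkpoint(model, path="xlnet_params.pt"):
    # Persist only the parameter tensors, not the whole module object.
    torch.save(model.state_dict(), path)

def load_checkpoint(model, path="xlnet_params.pt"):
    model.load_state_dict(torch.load(path, map_location="cpu"))
    return model
```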
I want to do sentence embedding using your PyTorch code, but I cannot find how to prepare the test data input or the inference code.
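A generic sketch of turning per-token hidden states into a sentence embedding by mean pooling; how to obtain `hidden_states` from this repository's XLNet class depends on its forward signature, so treat the layout below as an assumption:

```python
import torch

def sentence_embedding(hidden_states, attention_mask):
    """Mean-pool token representations into one vector per sentence.

    hidden_states:  [seq_len, batch, d_model] float tensor
    attention_mask: [seq_len, batch] with 1 for real tokens, 0 for padding
    """
    mask = attention_mask.unsqueeze(-1).float()
    summed = (hidden_states * mask).sum(dim=0)
    counts = mask.sum(dim=0).clamp(min=1.0)  # avoid division by zero
    return summed / counts
```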
Can you help me? D:\xlnet-Pytorch-master>python main.py Model name 'bert-large-uncased' was not found in model name list (bert-base-uncased, bert-large-uncased, bert-base-cased, bert-large-cased, bert-base-multilingual-uncased, bert-base-multilingual-cased, bert-base-chinese). We assumed 'https://s3.amazonaws.com/models.huggingface.co/bert/bert-large-uncased-vocab.txt' was a path...
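This message usually means the tokenizer vocab could not be downloaded. One workaround, an assumption rather than an official fix, is to download `bert-large-uncased-vocab.txt` manually from the URL in the message and point the tokenizer at the local file:

```python
from pytorch_pretrained_bert import BertTokenizer

# Assumes bert-large-uncased-vocab.txt was downloaded into the working directory.
tokenizer = BertTokenizer.from_pretrained("./bert-large-uncased-vocab.txt")
print(tokenizer.tokenize("hello world"))
```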