Zhibin (Louis) Lu
I wrote a [Bert+CRF](https://github.com/Louis-udm/NER_BERT_CRF) model that reaches 92.29% F1 on the test data, slightly higher than yours but a little worse than the paper's 92.4%. Why might that be?
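For reference, here is a minimal sketch of what a BERT+CRF tagger can look like, assuming the HuggingFace transformers and pytorch-crf packages; it is illustrative, not the exact NER_BERT_CRF code:

```python
import torch.nn as nn
from transformers import BertModel
from torchcrf import CRF  # from the pytorch-crf package


class BertCrfTagger(nn.Module):
    def __init__(self, num_tags, pretrained="bert-base-cased"):
        super().__init__()
        self.bert = BertModel.from_pretrained(pretrained)
        self.classifier = nn.Linear(self.bert.config.hidden_size, num_tags)
        self.crf = CRF(num_tags, batch_first=True)

    def forward(self, input_ids, attention_mask, labels=None):
        # Token-level emissions from the BERT encoder
        hidden = self.bert(input_ids, attention_mask=attention_mask).last_hidden_state
        emissions = self.classifier(hidden)
        mask = attention_mask.bool()
        if labels is not None:
            # Padded label positions should hold any valid tag index (e.g. 0);
            # the CRF ignores them through the mask.
            return -self.crf(emissions, labels, mask=mask)  # negative log-likelihood loss
        return self.crf.decode(emissions, mask=mask)        # best tag sequence per example
```

Small differences in the tokenization, learning-rate schedule, or random seed can easily account for a few tenths of a point of F1 on CoNLL-style NER.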
Hi, no, I basically have no intention of writing a paper for this. Thank you for your attention.
Hi, I didn't try the BERT variants, but I think the modification is not large; you need to change the model code file.
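For example, swapping in a BERT variant usually just means loading a different pretrained encoder and matching its hidden size in the downstream layers. A minimal sketch assuming the HuggingFace transformers package (the model name is only an example):

```python
from transformers import AutoModel, AutoTokenizer

model_name = "roberta-base"  # hypothetical variant; any BERT-like checkpoint works
tokenizer = AutoTokenizer.from_pretrained(model_name)
encoder = AutoModel.from_pretrained(model_name)
hidden_size = encoder.config.hidden_size  # downstream layers must use this size
```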
Hi, I think that is not normal. I used a Tesla K40c for this paper. It depends on the size of the data set, but in my impression, training SST-2 takes up...
It's probably that you have a very big graph. You can find ways to reduce the vocabulary and delete some edges from the graph.
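For example, a rough sketch of how one could shrink a vocabulary graph before training; the frequency and edge thresholds here are illustrative, not values from the paper:

```python
import numpy as np
import scipy.sparse as sp


def prune_graph(adj: sp.csr_matrix, word_freq: np.ndarray,
                min_freq: int = 5, edge_threshold: float = 0.2) -> sp.csr_matrix:
    """Keep only frequent words and drop weak edges from a word-word graph.

    `adj` is the vocabulary adjacency matrix; `word_freq[i]` is the corpus
    frequency of word i (aligned with the rows of `adj`).
    """
    keep = np.where(word_freq >= min_freq)[0]     # reduce the vocabulary
    adj = adj[keep][:, keep]                      # restrict the graph to kept words
    adj.data[adj.data < edge_threshold] = 0.0     # delete weak (low-weight) edges
    adj.eliminate_zeros()
    return adj.tocsr()
```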
Sorry, I have lost that virtual environment, but I listed the major dependency libraries.
Hi, it seems that's at least one of the right ways.
> Thanks for your reply!
>
> I could run the model with that addition, but it didn't make much difference in the end score.
>
> Now I am...
@jaytimbadia Thank you for your attention. Both the GCN adjacency matrix and PMI are based on previous works; you can find many GCN-related papers and PMI papers. You can try...
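For a rough idea, PMI-based word-word edges are typically computed over sliding windows, as in TextGCN-style graphs. A minimal sketch; the window size and the positive-PMI threshold are illustrative:

```python
import math
from collections import Counter
from itertools import combinations


def pmi_edges(docs, window_size=20):
    """Return {(word_i, word_j): PMI} for word pairs with positive PMI.

    PMI(i, j) = log( p(i, j) / (p(i) * p(j)) ), with probabilities estimated
    from co-occurrence counts over sliding windows.
    """
    windows = []
    for doc in docs:
        tokens = doc.split()
        if len(tokens) <= window_size:
            windows.append(tokens)
        else:
            windows += [tokens[i:i + window_size]
                        for i in range(len(tokens) - window_size + 1)]

    n_windows = len(windows)
    word_count, pair_count = Counter(), Counter()
    for window in windows:
        uniq = sorted(set(window))
        word_count.update(uniq)
        pair_count.update(combinations(uniq, 2))

    edges = {}
    for (i, j), n_ij in pair_count.items():
        pmi = math.log(n_ij * n_windows / (word_count[i] * word_count[j]))
        if pmi > 0:  # keep only positive-PMI edges in the graph
            edges[(i, j)] = pmi
    return edges
```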
Hi, thank you for your attention. It's just a torch.nn.Embedding with 768 dimensions; the initialization function is torch.nn.init.normal_. Just look at the source and do a little searching.
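A minimal sketch of that layer; the vocabulary size and standard deviation below are placeholders, not values taken from the code:

```python
import torch.nn as nn

vocab_size = 30000  # hypothetical vocabulary size
embedding = nn.Embedding(vocab_size, 768)                # 768-dimensional embedding
nn.init.normal_(embedding.weight, mean=0.0, std=0.02)    # std assumed; check the source for the exact value
```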