Zhibin (Louis) Lu
I wrote a [Bert+CRF](https://github.com/Louis-udm/NER_BERT_CRF) model that reaches 92.29% F1 on the test data, slightly higher than yours but a little worse than the paper's 92.4%. Why might that be?
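For reference, here is a minimal sketch of what a BERT+CRF tagger can look like, assuming the HuggingFace transformers and pytorch-crf packages; it is illustrative, not the exact NER_BERT_CRF code:

```python
import torch.nn as nn
from transformers import BertModel
from torchcrf import CRF  # from the pytorch-crf package


class BertCrfTagger(nn.Module):
    def __init__(self, num_tags, pretrained="bert-base-cased"):
        super().__init__()
        self.bert = BertModel.from_pretrained(pretrained)
        self.classifier = nn.Linear(self.bert.config.hidden_size, num_tags)
        self.crf = CRF(num_tags, batch_first=True)

    def forward(self, input_ids, attention_mask, labels=None):
        # Token-level emissions from the BERT encoder
        hidden = self.bert(input_ids, attention_mask=attention_mask).last_hidden_state
        emissions = self.classifier(hidden)
        mask = attention_mask.bool()
        if labels is not None:
            # Padded label positions should hold any valid tag index (e.g. 0);
            # the CRF ignores them through the mask.
            return -self.crf(emissions, labels, mask=mask)  # negative log-likelihood loss
        return self.crf.decode(emissions, mask=mask)        # best tag sequence per example
```

Small differences in the tokenization, learning-rate schedule, or random seed can easily account for a few tenths of a point of F1 on CoNLL-style NER.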
Hi, no, I basically have no intention of writing a paper for this. Thank you for your attention.
Hi, I didn't try the BERT variants, but I think the modification is not large; you need to change the model code file.
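For example, swapping in a BERT variant usually just means loading a different pretrained encoder and matching its hidden size in the downstream layers. A minimal sketch assuming the HuggingFace transformers package (the model name is only an example):

```python
from transformers import AutoModel, AutoTokenizer

model_name = "roberta-base"  # hypothetical variant; any BERT-like checkpoint works
tokenizer = AutoTokenizer.from_pretrained(model_name)
encoder = AutoModel.from_pretrained(model_name)
hidden_size = encoder.config.hidden_size  # downstream layers must use this size
```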
Hi, I think that is not normal. I used a Tesla K40c for this paper. It depends on the size of the data set, but in my impression, training SST-2 takes up...
It's probably that you have a very big graph. You can find ways to reduce the vocabulary and delete some edges from the graph.
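For example, a rough sketch of how one could shrink a vocabulary graph before training; the frequency and edge thresholds here are illustrative, not values from the paper:

```python
import numpy as np
import scipy.sparse as sp


def prune_graph(adj: sp.csr_matrix, word_freq: np.ndarray,
                min_freq: int = 5, edge_threshold: float = 0.2) -> sp.csr_matrix:
    """Keep only frequent words and drop weak edges from a word-word graph.

    `adj` is the vocabulary adjacency matrix; `word_freq[i]` is the corpus
    frequency of word i (aligned with the rows of `adj`).
    """
    keep = np.where(word_freq >= min_freq)[0]     # reduce the vocabulary
    adj = adj[keep][:, keep]                      # restrict the graph to kept words
    adj.data[adj.data < edge_threshold] = 0.0     # delete weak (low-weight) edges
    adj.eliminate_zeros()
    return adj.tocsr()
```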
Sorry, I have lost that virtual environment, but I listed the major dependency libraries.
Hi, it seems that's at least one of the right ways.
> Thanks for your reply!
>
> I could run the model with that addition, but it didn't make much difference in the end score.
>
> Now I am...
@jaytimbadia Thank you for your attention. Both the GCN adjacency matrix and PMI are based on previous works; you can find many GCN-related papers and PMI papers. You can try...
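For a rough idea, PMI-based word-word edges are typically computed over sliding windows, as in TextGCN-style graphs. A minimal sketch; the window size and the positive-PMI threshold are illustrative:

```python
import math
from collections import Counter
from itertools import combinations


def pmi_edges(docs, window_size=20):
    """Return {(word_i, word_j): PMI} for word pairs with positive PMI.

    PMI(i, j) = log( p(i, j) / (p(i) * p(j)) ), with probabilities estimated
    from co-occurrence counts over sliding windows.
    """
    windows = []
    for doc in docs:
        tokens = doc.split()
        if len(tokens) <= window_size:
            windows.append(tokens)
        else:
            windows += [tokens[i:i + window_size]
                        for i in range(len(tokens) - window_size + 1)]

    n_windows = len(windows)
    word_count, pair_count = Counter(), Counter()
    for window in windows:
        uniq = sorted(set(window))
        word_count.update(uniq)
        pair_count.update(combinations(uniq, 2))

    edges = {}
    for (i, j), n_ij in pair_count.items():
        pmi = math.log(n_ij * n_windows / (word_count[i] * word_count[j]))
        if pmi > 0:  # keep only positive-PMI edges in the graph
            edges[(i, j)] = pmi
    return edges
```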
Hi, thank you for your attention. It's just a torch.nn.Embedding with 768 dimensions; the initialization function is torch.nn.init.normal_. Just look at the source and do a little searching.
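A minimal sketch of that layer; the vocabulary size and standard deviation below are placeholders, not values taken from the code:

```python
import torch.nn as nn

vocab_size = 30000  # hypothetical vocabulary size
embedding = nn.Embedding(vocab_size, 768)                # 768-dimensional embedding
nn.init.normal_(embedding.weight, mean=0.0, std=0.02)    # std assumed; check the source for the exact value
```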