Allan Jie

Results: 40 comments of Allan Jie

Got it. Maybe you should specify them in the table/paper? From the table, it seems only those marked with "*" use a train-test split.

In the SVAMP paper, Appendix A shows that the Transformer with a RoBERTa encoder obtains 38.9 accuracy ![image](https://user-images.githubusercontent.com/3351187/141454892-c8649842-2501-4423-9b9d-030ea9f3a565.png) But it seems RobertaGen only gets 30.3 here. Curious about the...

Thanks. Can you let me know which version of this repo you are using (PyTorch or DyNet)?

Are you able to overfit your dataset with a normal LSTM-CRF model?

I did not train ELMo but used existing pretrained ELMo models from different languages. Usually people pretrain them following the code provided by AllenNLP; let me know if you...
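
A minimal sketch of what loading such a pretrained ELMo model with AllenNLP typically looks like; the options/weights file paths are placeholders for whichever pretrained model is used, not files referenced in this thread:

```python
from allennlp.modules.elmo import Elmo, batch_to_ids

options_file = "elmo_options.json"   # hypothetical local path to the model's options file
weight_file = "elmo_weights.hdf5"    # hypothetical local path to the model's weights

# num_output_representations=1 gives a single learned mix of the ELMo layers
elmo = Elmo(options_file, weight_file, num_output_representations=1, dropout=0.0)

sentences = [["This", "is", "a", "test", "."]]
character_ids = batch_to_ids(sentences)          # (batch, seq_len, 50) character ids
output = elmo(character_ids)
embeddings = output["elmo_representations"][0]   # (batch, seq_len, 1024) contextual embeddings
```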

Thanks. Is it possible to provide details on how you did it for this dataset? I think this could be important to reproduce the performance and better help the...

Changing 4 to 3 works for me though. 😞

Sorry for the late reply, but which layer of the hidden states do you use? The average or the final layer?
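
A quick sketch of the two options being asked about, using AllenNLP's `ElmoEmbedder` from older AllenNLP versions; whether this matches the setup discussed in the thread is an assumption:

```python
from allennlp.commands.elmo import ElmoEmbedder

# Downloads the default English model if no options/weights are given
elmo = ElmoEmbedder()
layers = elmo.embed_sentence(["This", "is", "a", "test", "."])  # shape: (3, n_tokens, 1024)

avg_repr = layers.mean(axis=0)    # average over the 3 ELMo layers -> (n_tokens, 1024)
final_repr = layers[-1]           # final (top LSTM) layer only    -> (n_tokens, 1024)
```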

Thanks, I also found `weighted average` in [neuralnets/ELMoWordEmbeddings.py](https://github.com/UKPLab/elmo-bilstm-cnn-crf/blob/HEAD/neuralnets/ELMoWordEmbeddings.py#L104). Can I ask why it simply swaps the axes? If I'm not wrong, the `0` dimension is the layer and...
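
A toy NumPy illustration of the point behind this question, not a claim about what the repo's code does: for an ELMo-style array of shape (layers, tokens, dims), swapping axes 0 and 1 only reorders the dimensions and does not combine the layers, so any weighted averaging would have to happen downstream.

```python
import numpy as np

layers, tokens, dims = 3, 5, 1024
elmo_out = np.random.randn(layers, tokens, dims)

# Swapping axes 0 and 1 just moves the layer dimension; no layers are mixed.
swapped = np.swapaxes(elmo_out, 0, 1)
print(swapped.shape)                      # (5, 3, 1024)

# An explicit weighted average over the layer axis, for comparison:
weights = np.array([0.2, 0.3, 0.5])
weighted = np.tensordot(weights, elmo_out, axes=(0, 0))
print(weighted.shape)                     # (5, 1024)
```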