Iz Beltagy
You might also want to give Longformer a shot, especially if you are working on an NLP task, since it includes a pretrained model for long docs: https://github.com/allenai/longformer (self-promotion :D)
It is just one function call: https://github.com/allenai/scibert/blob/master/scripts/cheatsheet.txt#L6 The output format is slightly different from what BERT expects, so we fixed it manually after it was generated.
I am not familiar with how HF TF support works, but as far as I understand, we don't need to do anything specific on the model side to make it...
Yes, AllenNLP doesn't support gradient accumulation. We have it implemented in our fork of AllenNLP (check the requirements file: https://github.com/allenai/scibert/blob/master/requirements.txt)
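For reference, the idea itself is framework-independent. A minimal sketch of gradient accumulation in plain PyTorch (the model, loss, and step counts here are arbitrary placeholders, not the fork's actual code):

```python
import torch

# Toy model and optimizer; accumulate gradients over several small
# batches before taking one optimizer step.
model = torch.nn.Linear(10, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = torch.nn.CrossEntropyLoss()

accumulation_steps = 4  # effective batch size = 4 * per-step batch size
optimizer.zero_grad()
for step in range(8):
    x = torch.randn(2, 10)
    y = torch.randint(0, 2, (2,))
    # Scale the loss so accumulated gradients average rather than sum.
    loss = loss_fn(model(x), y) / accumulation_steps
    loss.backward()  # gradients accumulate in each parameter's .grad
    if (step + 1) % accumulation_steps == 0:
        optimizer.step()
        optimizer.zero_grad()
```

This lets you simulate a large batch on limited GPU memory, which is the usual reason to want it for BERT-sized models.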
@InesArous, you can try to follow one of the classification examples in the HF code https://github.com/huggingface/transformers/tree/master/examples/text-classification, maybe the `run_pl_glue.py` one.
@amandalmia14, you need to use `AutoModelForSequenceClassification` instead of `AutoModel`
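A minimal sketch of the difference (assuming the `allenai/scibert_scivocab_uncased` checkpoint; `num_labels=3` is an arbitrary placeholder for your task):

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = "allenai/scibert_scivocab_uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
# Unlike AutoModel (bare encoder), this adds a classification head on top.
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=3)

inputs = tokenizer("The cell cultures were incubated overnight.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
print(outputs.logits.shape)  # one logit per label: (1, 3)
```

`AutoModel` only returns hidden states; the `...ForSequenceClassification` variant gives you logits (and a loss if you pass `labels=`), which is what the classification examples expect.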
It is just English.
@shreyashub, I think you are talking about bc5cdr, not JNLPBA, because JNLPBA doesn't have a Disease category. For bc5cdr, we used a version that we had in s2 that dropped the...
Sorry for the confusion: `citation_intent` is SciCite, and `mag` is Paper Field.
This is an AllenNLP issue. Can you share the error stack trace?