SciBERT for text classification

Open InesArous opened this issue 4 years ago • 3 comments

Hi,

Thanks for your awesome work! I would like to use SciBERT for text classification. I managed to get some results by running the script train_allennlp_local.sh directly, after modifying the task field as described in the README. However, I am not able to reproduce the same results with Hugging Face's framework. Are there any resources or tutorials on how to set up an equivalent configuration in the two frameworks? Thanks!

InesArous avatar Jun 28 '20 17:06 InesArous

@InesArous I was able to fine-tune BERT for text classification. However, if I swap the BERT tokenizer and sequence-classification model for the SciBERT ones as shown below,

from:

    tokenizer = BertTokenizer.from_pretrained('bert-base-uncased', do_lower_case=True)
    model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=len(label_dict), output_attentions=False, output_hidden_states=False)

to:

    tokenizer = AutoTokenizer.from_pretrained('allenai/scibert_scivocab_uncased')
    model = AutoModel.from_pretrained('allenai/scibert_scivocab_uncased')

I get an error while training the model:

`
TypeError                                 Traceback (most recent call last) in
     17 }
     18
---> 19 outputs = model(**inputs)
     20
     21 loss = outputs[0]

d:\multi_class_text_classification\venv\lib\site-packages\torch\nn\modules\module.py in _call_impl(self, *input, **kwargs)
    720 result = self._slow_forward(*input, **kwargs)
    721 else:
--> 722 result = self.forward(*input, **kwargs)
    723 for hook in itertools.chain(
    724 _global_forward_hooks.values(),

TypeError: forward() got an unexpected keyword argument 'labels'
`

amandalmia14 avatar Sep 23 '20 08:09 amandalmia14

@InesArous, you can try to follow one of the classification examples in the HF code https://github.com/huggingface/transformers/tree/master/examples/text-classification, maybe the run_pl_glue.py one.

ibeltagy avatar Sep 24 '20 04:09 ibeltagy

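@amandalmia14, you need to use AutoModelForSequenceClassification instead of AutoModel. AutoModel loads only the bare encoder, which has no classification head and whose forward() does not accept labels.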

ibeltagy avatar Sep 24 '20 04:09 ibeltagy
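
For anyone landing here with the same TypeError, the following is a minimal sketch of the difference between the two model classes, not the exact training code from the comments above. To stay runnable offline it builds a tiny, randomly initialised BERT from a config; the config sizes, the three-label setup and the dummy batch are all assumptions made for illustration. In real use the classifier would be loaded with AutoModelForSequenceClassification.from_pretrained('allenai/scibert_scivocab_uncased', num_labels=len(label_dict)).

```python
import inspect

import torch
from transformers import BertConfig, BertForSequenceClassification, BertModel

# Assumption: a tiny random-weight config so the sketch needs no download.
# For SciBERT itself, load 'allenai/scibert_scivocab_uncased' with
# AutoModelForSequenceClassification.from_pretrained(..., num_labels=...).
config = BertConfig(
    vocab_size=100,
    hidden_size=32,
    num_hidden_layers=1,
    num_attention_heads=2,
    intermediate_size=64,
    num_labels=3,
)

encoder = BertModel(config)                           # what AutoModel resolves to for BERT checkpoints
classifier = BertForSequenceClassification(config)    # encoder plus a classification head

# The bare encoder's forward() has no 'labels' parameter; the classifier's does.
accepts_labels_encoder = "labels" in inspect.signature(encoder.forward).parameters
accepts_labels_classifier = "labels" in inspect.signature(classifier.forward).parameters

# Dummy batch: 2 sequences of 8 token ids, with one class label each.
input_ids = torch.randint(0, config.vocab_size, (2, 8))
attention_mask = torch.ones_like(input_ids)
labels = torch.tensor([0, 2])

# When labels are passed, the first output is the loss and the second the logits.
outputs = classifier(input_ids=input_ids, attention_mask=attention_mask, labels=labels)
loss, logits = outputs[0], outputs[1]
```

Because the classification model returns the loss first when labels are supplied, a training loop that reads `loss = outputs[0]` should work unchanged once the model class is swapped.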