BERT-NER

Pytorch-Named-Entity-Recognition-with-BERT

33 BERT-NER issues

Hi, since BERT only supports sequences of up to 512 tokens, how can I proceed if my text is longer than 512 tokens? I used BertForTokenClassification for entity recognition...
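One way around the 512-token limit is to split the word sequence into chunks that fit the model and concatenate the per-chunk predictions. A minimal sketch, assuming a HuggingFace-style tokenizer and a `predict(words)` helper that wraps `BertForTokenClassification` (both are assumptions, not part of this repo):

```python
from typing import Callable, List

def predict_long_text(words: List[str],
                      tokenizer,
                      predict: Callable[[List[str]], List[str]],
                      max_len: int = 512) -> List[str]:
    budget = max_len - 2                      # reserve room for [CLS] and [SEP]
    chunks, current, current_len = [], [], 0
    for word in words:
        n_subwords = len(tokenizer.tokenize(word))
        if current and current_len + n_subwords > budget:
            chunks.append(current)
            current, current_len = [], 0
        current.append(word)
        current_len += n_subwords
    if current:
        chunks.append(current)
    tags: List[str] = []
    for chunk in chunks:
        tags.extend(predict(chunk))           # one tag per input word
    return tags
```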

In `bert.py`, line 49 (the two `for` lines are the ones highlighted in the report):

```python
def tokenize(self, text: str):
    """ tokenize input"""
    words = word_tokenize(text)
    tokens = []
    valid_positions = []
    for i, word in enumerate(words):
        token = self.tokenizer.tokenize(word)
        tokens.extend(token)
        for i in range(len(token)):
            if ...
```
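The quoted snippet is cut off after the inner `if`. A plausible completion of the routine (a sketch, not necessarily the repository's exact code) marks only the first sub-token of each word as a valid label position; note that the two highlighted `for` lines both reuse the loop variable `i`, which the sketch renames to avoid confusion:

```python
from nltk import word_tokenize

def tokenize(self, text: str):
    """Tokenize input and mark which sub-tokens start a word."""
    words = word_tokenize(text)
    tokens = []
    valid_positions = []
    for word in words:
        sub_tokens = self.tokenizer.tokenize(word)
        tokens.extend(sub_tokens)
        for j in range(len(sub_tokens)):
            # only the first sub-token of each word carries the word's label
            valid_positions.append(1 if j == 0 else 0)
    return tokens, valid_positions
```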

I would love to train this with the bert-large model, then fine-tune it on the BC5CDR-chem corpus for NER, and then use it to predict on unlabelled raw...
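For the first step, a rough sketch of loading a bert-large checkpoint as a token classifier via the HuggingFace `transformers` API (the BC5CDR-chem label set and the use of `transformers` instead of this repo's training script are assumptions):

```python
from transformers import BertTokenizerFast, BertForTokenClassification

labels = ["O", "B-Chemical", "I-Chemical"]   # BC5CDR-chem style tags (assumption)
tokenizer = BertTokenizerFast.from_pretrained("bert-large-cased")
model = BertForTokenClassification.from_pretrained(
    "bert-large-cased", num_labels=len(labels)
)
# ...then build a token-classification dataset from the BC5CDR-chem corpus
# and fine-tune with a standard training loop or the Trainer API...
```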

Hi, thanks for sharing the code. I just want to ask about "num_train_epochs": how many epochs are enough for the NER task?

The CoNLL dataset can predict only 5 tags; OntoNotes can predict around 18 tags. Will the code work fine if I just replace the dataset with OntoNotes, or are there...
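Switching to OntoNotes also means the label list has to cover its 18 entity types. A hedged sketch of what that list might look like; the `get_labels()` helper and the `[CLS]`/`[SEP]` entries are assumptions about how this repo constructs its label set:

```python
ONTONOTES_TYPES = [
    "PERSON", "NORP", "FAC", "ORG", "GPE", "LOC", "PRODUCT", "EVENT",
    "WORK_OF_ART", "LAW", "LANGUAGE", "DATE", "TIME", "PERCENT", "MONEY",
    "QUANTITY", "ORDINAL", "CARDINAL",
]

def get_labels():
    """Build a BIO label list for the 18 OntoNotes entity types."""
    labels = ["O"]
    for entity_type in ONTONOTES_TYPES:
        labels += [f"B-{entity_type}", f"I-{entity_type}"]
    return labels + ["[CLS]", "[SEP]"]
```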

It's not a floating point issue between devices where only the certainty changes: **on GPU I get (4/18), on CPU (16/18)**. In order to hard-code the GPU and then...
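To rule out device-related differences it helps to make the device explicit and run the same inputs on both. A small sketch, with `model`, `input_ids`, and `attention_mask` standing in for whatever the prediction script builds:

```python
import torch

def predict_on(model, input_ids, attention_mask, device_str="cuda:0"):
    """Run token classification on an explicit device (pass "cpu" to compare)."""
    device = torch.device(device_str)
    model = model.to(device).eval()
    with torch.no_grad():
        out = model(input_ids.to(device), attention_mask=attention_mask.to(device))
    logits = out if torch.is_tensor(out) else out[0]   # handle tensor or tuple returns
    return torch.argmax(logits, dim=-1).cpu()
```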

The supported sequence length of BERT is up to 512 tokens. Adding simple sentence tokenization to the API would enable users to process longer texts.
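A minimal sketch of the requested behaviour, assuming an `Ner`-style model object exposing a `predict(sentence)` method (an assumption about the API): split the input into sentences first, then predict sentence by sentence and merge the results.

```python
from nltk.tokenize import sent_tokenize

def predict_long(model, text: str):
    """Sentence-split the text so each piece is much more likely to fit BERT's 512-token limit."""
    results = []
    for sentence in sent_tokenize(text):
        results.extend(model.predict(sentence))
    return results
```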

In the function `convert_examples_to_features`, each word may be split into more than one token by the BERT tokenizer

```python
token = tokenizer.tokenize(word)
tokens.extend(token)
```

but the length of `labels` remains the same....
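One common fix (a sketch, not necessarily how this repository resolves it) is to extend the label sequence while tokenizing, so the labels stay aligned with the sub-tokens:

```python
def align_labels(words, labels, tokenizer, pad_label="X"):
    """Keep one label per sub-token: the first sub-token of a word keeps the
    real label, the remaining sub-tokens get a padding label."""
    tokens, aligned = [], []
    for word, label in zip(words, labels):
        sub_tokens = tokenizer.tokenize(word)
        tokens.extend(sub_tokens)
        aligned.extend([label] + [pad_label] * (len(sub_tokens) - 1))
    return tokens, aligned
```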

Hello, thank you for your great work! The F1 score can reach the high level mentioned in this repo via the experiment branch. However, when I tried to train the...