
Error while trying to use the model

Open sandeeppilania opened this issue 5 years ago • 17 comments

Traceback (most recent call last):
  File "bert.py", line 429, in <module>
    main()
  File "bert.py", line 373, in main
    config, config.task_name, tokenizer, evaluate=False)
  File "bert.py", line 268, in load_and_cache_examples
    examples, label_list, config.max_seq_len, tokenizer, "classification", use_entity_indicator=config.use_entity_indicator)
  File "C:\Users\pilanisp\Desktop\BERT FINAL\BERT IE\bert-relation-classification\utils.py", line 281, in convert_examples_to_features
    e11_p = tokens_a.index("#")+1 # the start position of entity1
ValueError: '#' is not in list

sandeeppilania avatar Jan 13 '20 19:01 sandeeppilania

I have the same issue!

bilalghanem avatar Jan 17 '20 15:01 bilalghanem

me too

vpvsankar avatar Jan 24 '20 10:01 vpvsankar

I think the authors were planning to use E11, E21, etc., but then changed the code to use # & $.

What I have done to solve the issue is that when I read the data in the beginning of the code, I convert the special tokens as the following:

E11 & E21 -> #
E21 & E22 -> $

and then everything worked perfectly.

bilalghanem avatar Jan 24 '20 11:01 bilalghanem

@bilalghanem Can you share an example of how you converted the training examples? Did you change the entire train.tsv up front, or are you changing it as you read through the file in the code?

sandeeppilania avatar Feb 12 '20 16:02 sandeeppilania

@sandeeppilania I changed it in the code. Simply, in the function convert_examples_to_features, before the line l = len(tokens_a), use .replace to convert them.

ex.

str.replace('E11', '#')
etc.

bilalghanem avatar Feb 12 '20 16:02 bilalghanem

@bilalghanem I am asking something silly here, sorry about that, but on the line tokens_a = tokenizer.tokenize(example.text_a) in the function convert_examples_to_features I tried printing out tokens_a, and this is what I see: ['the', 'system', 'as', 'described', 'above', 'has', 'its', 'greatest', 'application', 'in', 'an', 'array', '##ed', '[', 'e', '##11', ']', 'configuration', '[', 'e', '##12', ']', 'of', 'antenna', '[', 'e', '##21', ']', 'elements', '[', 'e', '##22', ']'], so I don't see how the replace str.replace('E11', '#') would work here.

sandeeppilania avatar Feb 12 '20 19:02 sandeeppilania

Sorry, you're right: do it before applying the tokenizer, or even when you start reading the data.
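For reference, here is a minimal sketch of that approach, replacing the bracketed markers in the raw text before it ever reaches the tokenizer (the helper name and marker spellings are illustrative; adjust them to match your train.tsv):

```python
# Hypothetical helper: map the bracketed entity markers onto the
# single-character markers that convert_examples_to_features expects.
def replace_entity_markers(text):
    for marker in ("[E11]", "[E12]"):   # entity 1 boundaries -> "#"
        text = text.replace(marker, "#")
    for marker in ("[E21]", "[E22]"):   # entity 2 boundaries -> "$"
        text = text.replace(marker, "$")
    return text

sentence = ("the system as described above has its greatest application "
            "in an arrayed [E11] configuration [E12] of antenna "
            "[E21] elements [E22]")
print(replace_entity_markers(sentence))
# ... in an arrayed # configuration # of antenna $ elements $
```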

bilalghanem avatar Feb 12 '20 19:02 bilalghanem

Got it. So basically,

0 the system as described above has its greatest application in an arrayed [E11] configuration [E12] of antenna [E21] elements [E22] 12 whole component 2

should be converted to

0 the system as described above has its greatest application in an arrayed #configuration# of antenna $elements$ 12 whole component 2

Right? Because my understanding is that e11_p = tokens_a.index("#")+1 is looking for just the next offset after #.

sandeeppilania avatar Feb 12 '20 20:02 sandeeppilania

@sandeeppilania yes, exactly.

And this line finds the end of the entity in case it is longer than a single word: e12_p = l-tokens_a[::-1].index("#")+1
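To make the index arithmetic concrete, here is a toy run of both expressions on a hand-built token list (the list is illustrative, not real tokenizer output):

```python
# Entity 1 is delimited by "#", entity 2 by "$".
tokens_a = ["an", "arrayed", "#", "configuration", "#", "of",
            "antenna", "$", "elements", "$"]
l = len(tokens_a)

e11_p = tokens_a.index("#") + 1            # position just after the first "#"
e12_p = l - tokens_a[::-1].index("#") + 1  # locates the last "#" from the end

print(e11_p, e12_p)  # prints: 3 6
```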

bilalghanem avatar Feb 13 '20 08:02 bilalghanem

@sandeeppilania hi, brother... I have the same issue. Can you share the part of the code where exactly you made changes to solve the problem?

thanks in advance

ejokhan avatar Feb 16 '20 17:02 ejokhan

E11 & E21 -> #
E21 & E22 -> $

You mean E11 & E12?

Valdegg avatar Feb 18 '20 15:02 Valdegg

I wonder why they didn't try running the software before they posted it here (and explicitly said it's "stable", when it doesn't even run)...

Valdegg avatar Feb 18 '20 15:02 Valdegg


Please check the following lines in bert.py and uncomment the line you need:

#additional_special_tokens = ["[E11]", "[E12]", "[E21]", "[E22]"]
additional_special_tokens = []
#additional_special_tokens = ["e11", "e12", "e21", "e22"]

wang-h avatar Mar 03 '20 12:03 wang-h

Hey guys, look here! Modify additional_special_tokens in bert.py so that it corresponds to tokens_a in util.py, and pay attention to the starting and ending subscript positions; if necessary, modify the code around line 275 of util.py. After finishing you can start training. I have tried this method and it is effective.
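For instance, if you keep the bracketed markers as special tokens, the index lookups in util.py have to search for those markers instead of "#" and "$". A rough sketch of what that might look like (the lowercase spellings and the sample tokens_a are illustrative; verify them against your own tokenizer output):

```python
# Illustrative tokens_a, mimicking an uncased tokenizer that kept the
# bracketed markers whole rather than splitting them into word pieces.
tokens_a = ["an", "arrayed", "[e11]", "configuration", "[e12]",
            "of", "antenna", "[e21]", "elements", "[e22]"]

e11_p = tokens_a.index("[e11]") + 1  # start position of entity 1
e12_p = tokens_a.index("[e12]") + 1  # end position of entity 1
e21_p = tokens_a.index("[e21]") + 1  # start position of entity 2
e22_p = tokens_a.index("[e22]") + 1  # end position of entity 2
print(e11_p, e12_p, e21_p, e22_p)    # 3 5 8 10
```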


seesky8848 avatar May 18 '22 14:05 seesky8848

I am sorry I have no time to correct the code; the error arises when you are using a modern transformers library. After

model = XXX.from_pretrained(args.bert_model, args=args)
tokenizer.add_tokens(additional_special_tokens)

add the following line:

model.resize_token_embeddings(len(tokenizer))
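A self-contained sketch of that fix against the Hugging Face transformers API (the bert-base-uncased checkpoint stands in for args.bert_model, and the token spellings mirror the commented-out line in bert.py):

```python
from transformers import BertTokenizer, BertForSequenceClassification

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased")

# Register the entity markers as extra vocabulary entries so the tokenizer
# can emit them as single tokens instead of word pieces.
additional_special_tokens = ["[E11]", "[E12]", "[E21]", "[E22]"]
num_added = tokenizer.add_tokens(additional_special_tokens)
print(f"added {num_added} tokens, vocab size is now {len(tokenizer)}")

# Grow the model's embedding matrix to cover the new token ids; without
# this, the new ids index past the end of the pretrained embeddings.
model.resize_token_embeddings(len(tokenizer))
```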

wang-h avatar Jun 21 '22 13:06 wang-h


OK, thank you 😲


seesky8848 avatar Jun 21 '22 15:06 seesky8848