qags
Issue while executing qg_utils.py: ValueError: invalid literal for int() with base 10: 'where'
While executing qg_utils.py, line 132 in https://github.com/W4ngatang/qags/blob/master/qg_utils.py raises the error below.
ValueError: invalid literal for int() with base 10: 'where'
The tokens in the variable tok_str are of type str, which is what causes the error.
Is this not the expected type of the elements in tok_str?
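For anyone trying to reproduce this outside the repo, here is a minimal sketch of what seems to be happening. The contents of tok_str below are a made-up stand-in (the real values come from the log file), and the list comprehension is only an assumption about what line 132 does, based on the error message:

```python
# Hypothetical stand-in for tok_str: plain-text words instead of integer token ids.
tok_str = "where was the treaty signed ?"

try:
    # Assumed shape of the failing step: every whitespace-separated token is cast to int.
    tok_ids = [int(tok) for tok in tok_str.split()]
except ValueError as err:
    message = str(err)
    print(message)  # invalid literal for int() with base 10: 'where'
```

The cast fails on the first word, which matches the traceback above.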
I'm facing the same issue. I believe this is legacy code from the author (including the GPT tokenizer decoding that follows it), considering that the log file we pass in contains the questions as plain text and that there are lines replacing <s> and <mask>.
@W4ngatang correct me if I'm wrong.
Have you solved the problem? @mriganktiwari @sonsus
Hey, I just encountered the same problem. Is there a solution?
Hi everyone,
I'm the next one with the same issue. Has anyone been able to solve it?
My solution was to write the raw tokens to the gen_fh file instead of decoding them. Any comments on that?
Best,
Gisela
I never found a solution, and I stopped trying a long time ago. If someone finds one, please post it here.
My solution is to delete the tokenize step in qg_utils (lines 135-136), because I assume that the questions in the log file are what we need.
Wondering whether I am correct. :)
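If it helps, the two workarounds in this thread (writing the raw tokens, or dropping the decode step) could be combined into a fallback that only decodes when the tokens really are integer ids. This is just a sketch, not the author's code: tokens_to_text, toy_vocab and toy_decode are hypothetical names, and the decoder here is a toy stand-in rather than the GPT tokenizer used in the repo.

```python
from typing import Callable, List


def tokens_to_text(tok_str: str, decode_ids: Callable[[List[int]], str]) -> str:
    """Decode tok_str if it holds integer ids; otherwise return it as plain text."""
    try:
        ids = [int(tok) for tok in tok_str.split()]
    except ValueError:
        # At least one token is not an integer, so assume the line is already text.
        return tok_str.strip()
    return decode_ids(ids)


# Toy decoder for illustration only (hypothetical vocabulary).
toy_vocab = {0: "where", 1: "is", 2: "it"}


def toy_decode(ids: List[int]) -> str:
    return " ".join(toy_vocab[i] for i in ids)


plain = tokens_to_text("where is it", toy_decode)  # already text, passed through
decoded = tokens_to_text("0 1 2", toy_decode)      # integer ids, decoded
```

One caveat: a question made up only of digits (for example "2020") would be treated as token ids, so this is only safe if such lines cannot appear in the log file.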