im2latex-tensorflow
Many errors on Colab: why is raw_input undefined, and why can't the utf-8 codec decode byte 0xe7?
When I ran this code in Google Colab, I got the errors below. How can I solve this? Any ideas? I don't know anything about ML or TF, though.
I just cloned the repository and ran the code below in Colab:
%cd /content/
%cd /content/im2latex-tensorflow/im2markup
!python scripts/preprocessing/preprocess_images.py --input-dir ../formula_images --output-dir ../images_processed
!python scripts/preprocessing/preprocess_formulas.py --mode normalize --input-file ../im2latex_formulas.lst --output-file formulas.norm.lst
!python scripts/preprocessing/preprocess_filter.py --filter --image-dir ../images_processed --label-path formulas.norm.lst --data-path ../im2latex_train.lst --output-path train.lst
!python scripts/preprocessing/preprocess_filter.py --filter --image-dir ../images_processed --label-path formulas.norm.lst --data-path ../im2latex_validate.lst --output-path validate.lst
!python scripts/preprocessing/preprocess_filter.py --no-filter --image-dir ../images_processed --label-path formulas.norm.lst --data-path ../im2latex_test.lst --output-path test.lst
!python scripts/preprocessing/generate_latex_vocab.py --data-path train.lst --label-path formulas.norm.lst --output-file latex_vocab.txt
!python /content/attention.py
Output:
/content/im2latex-tensorflow/im2markup
2021-07-22 07:01:50,172 root INFO Script being executed: scripts/preprocessing/preprocess_images.py
[240, 100]
Traceback (most recent call last):
File "scripts/preprocessing/preprocess_images.py", line 103, in <module>
main(sys.argv[1:])
File "scripts/preprocessing/preprocess_images.py", line 91, in main
raw_input()
NameError: name 'raw_input' is not defined
2021-07-22 07:01:50,343 root INFO Script being executed: scripts/preprocessing/preprocess_formulas.py
Traceback (most recent call last):
File "scripts/preprocessing/preprocess_formulas.py", line 86, in <module>
main(sys.argv[1:])
File "scripts/preprocessing/preprocess_formulas.py", line 64, in main
fout.write(open(output_file).read().replace('\r', ' ')) # delete \r
File "/usr/lib/python3.7/codecs.py", line 322, in decode
(result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe7 in position 854238: invalid continuation byte
2021-07-22 07:01:50,724 root INFO Script being executed: scripts/preprocessing/preprocess_filter.py
Traceback (most recent call last):
File "scripts/preprocessing/preprocess_filter.py", line 115, in <module>
main(sys.argv[1:])
File "scripts/preprocessing/preprocess_filter.py", line 78, in main
labels = open(parameters.label_path).readlines()
File "/usr/lib/python3.7/codecs.py", line 322, in decode
(result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe7 in position 2270: invalid continuation byte
2021-07-22 07:01:51,033 root INFO Script being executed: scripts/preprocessing/preprocess_filter.py
Traceback (most recent call last):
File "scripts/preprocessing/preprocess_filter.py", line 115, in <module>
main(sys.argv[1:])
File "scripts/preprocessing/preprocess_filter.py", line 78, in main
labels = open(parameters.label_path).readlines()
File "/usr/lib/python3.7/codecs.py", line 322, in decode
(result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe7 in position 2270: invalid continuation byte
2021-07-22 07:01:51,362 root INFO Script being executed: scripts/preprocessing/preprocess_filter.py
2021-07-22 07:01:51,396 root INFO 0 discarded. 0 not found in ../images_processed.
2021-07-22 07:01:51,397 root INFO Jobs finished
2021-07-22 07:01:51,532 root INFO Script being executed: scripts/preprocessing/generate_latex_vocab.py
Traceback (most recent call last):
File "scripts/preprocessing/generate_latex_vocab.py", line 80, in <module>
main(sys.argv[1:])
File "scripts/preprocessing/generate_latex_vocab.py", line 49, in main
formulas = open(label_path).readlines()
File "/usr/lib/python3.7/codecs.py", line 322, in decode
(result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe7 in position 2270: invalid continuation byte
python3: can't open file '/content/attention.py': [Errno 2] No such file or directory
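A note on the first two kinds of error, in case it helps narrow things down: both look like Python 2 vs Python 3 differences, since the log shows the scripts running under Python 3.7. `raw_input()` was renamed to `input()` in Python 3, and `open()` in Python 3 decodes text (as UTF-8 on Colab) instead of returning raw bytes. Byte 0xe7 is "ç" in Latin-1, so the formulas file may be Latin-1 encoded, but that is only a guess. Below is a minimal sketch of both workarounds, not a confirmed fix for the repository. The `read_lines` helper, the `latin-1` default, and the demo file are assumptions made for illustration.

```python
import builtins
import os
import tempfile

# Python 3 renamed raw_input() to input(); use raw_input if it exists (Python 2),
# otherwise fall back to input.
raw_input = getattr(builtins, "raw_input", input)


def read_lines(path, encoding="latin-1"):
    """Read text lines with an explicit encoding (hypothetical helper).

    latin-1 is an assumption about the data file; it maps every byte to a
    character, so decoding never raises, but the text may be wrong if the
    file is really in another encoding.
    """
    with open(path, encoding=encoding) as f:
        return f.readlines()


# Demo on a throwaway file containing byte 0xe7, the byte from the traceback.
with tempfile.NamedTemporaryFile(delete=False, suffix=".lst") as tmp:
    tmp.write(b"fa\xe7ade x^2\n")
    demo_path = tmp.name

# Decoding as UTF-8 fails the same way the preprocessing scripts do.
try:
    with open(demo_path, encoding="utf-8") as f:
        f.read()
    utf8_failed = False
except UnicodeDecodeError:
    utf8_failed = True

# Decoding with the explicit encoding succeeds.
demo_lines = read_lines(demo_path)
os.remove(demo_path)
```

In the scripts themselves, the equivalent change would be passing `encoding=...` to the `open(...)` calls shown in the tracebacks and replacing `raw_input()` with `input()`. Running the scripts under Python 2, which they appear to have been written for, might be another option.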