im2latex-tensorflow

Got many errors: why raw_input? Also, the utf-8 codec can't decode byte 0xe7

Open GhostOps77 opened this issue 4 years ago • 0 comments

When I ran this code in Google Colab, I got the errors below. How would I solve this? Any ideas? I don't know anything about ML or TF, though.

I just cloned the repository and ran the code below in Colab:

%cd /content/
%cd /content/im2latex-tensorflow/im2markup

!python scripts/preprocessing/preprocess_images.py --input-dir ../formula_images --output-dir ../images_processed
!python scripts/preprocessing/preprocess_formulas.py --mode normalize --input-file ../im2latex_formulas.lst --output-file formulas.norm.lst

!python scripts/preprocessing/preprocess_filter.py --filter --image-dir ../images_processed --label-path formulas.norm.lst --data-path ../im2latex_train.lst --output-path train.lst
!python scripts/preprocessing/preprocess_filter.py --filter --image-dir ../images_processed --label-path formulas.norm.lst --data-path ../im2latex_validate.lst --output-path validate.lst
!python scripts/preprocessing/preprocess_filter.py --no-filter --image-dir ../images_processed --label-path formulas.norm.lst --data-path ../im2latex_test.lst --output-path test.lst
!python scripts/preprocessing/generate_latex_vocab.py --data-path train.lst --label-path formulas.norm.lst --output-file latex_vocab.txt

!python /content/attention.py

Output:

/content/im2latex-tensorflow/im2markup
2021-07-22 07:01:50,172 root  INFO     Script being executed: scripts/preprocessing/preprocess_images.py
[240, 100]
Traceback (most recent call last):
  File "scripts/preprocessing/preprocess_images.py", line 103, in <module>
    main(sys.argv[1:])
  File "scripts/preprocessing/preprocess_images.py", line 91, in main
    raw_input()
NameError: name 'raw_input' is not defined
2021-07-22 07:01:50,343 root  INFO     Script being executed: scripts/preprocessing/preprocess_formulas.py
Traceback (most recent call last):
  File "scripts/preprocessing/preprocess_formulas.py", line 86, in <module>
    main(sys.argv[1:])
  File "scripts/preprocessing/preprocess_formulas.py", line 64, in main
    fout.write(open(output_file).read().replace('\r', ' ')) # delete \r
  File "/usr/lib/python3.7/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe7 in position 854238: invalid continuation byte
2021-07-22 07:01:50,724 root  INFO     Script being executed: scripts/preprocessing/preprocess_filter.py
Traceback (most recent call last):
  File "scripts/preprocessing/preprocess_filter.py", line 115, in <module>
    main(sys.argv[1:])
  File "scripts/preprocessing/preprocess_filter.py", line 78, in main
    labels = open(parameters.label_path).readlines()
  File "/usr/lib/python3.7/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe7 in position 2270: invalid continuation byte
2021-07-22 07:01:51,033 root  INFO     Script being executed: scripts/preprocessing/preprocess_filter.py
Traceback (most recent call last):
  File "scripts/preprocessing/preprocess_filter.py", line 115, in <module>
    main(sys.argv[1:])
  File "scripts/preprocessing/preprocess_filter.py", line 78, in main
    labels = open(parameters.label_path).readlines()
  File "/usr/lib/python3.7/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe7 in position 2270: invalid continuation byte
2021-07-22 07:01:51,362 root  INFO     Script being executed: scripts/preprocessing/preprocess_filter.py
2021-07-22 07:01:51,396 root  INFO     0 discarded. 0 not found in ../images_processed.
2021-07-22 07:01:51,397 root  INFO     Jobs finished
2021-07-22 07:01:51,532 root  INFO     Script being executed: scripts/preprocessing/generate_latex_vocab.py
Traceback (most recent call last):
  File "scripts/preprocessing/generate_latex_vocab.py", line 80, in <module>
    main(sys.argv[1:])
  File "scripts/preprocessing/generate_latex_vocab.py", line 49, in main
    formulas = open(label_path).readlines()
  File "/usr/lib/python3.7/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe7 in position 2270: invalid continuation byte
python3: can't open file '/content/attention.py': [Errno 2] No such file or directory
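
A note on the traces above: raw_input() exists only in Python 2, while the log shows Python 3.7, so the scripts appear to have been written for Python 2. The UnicodeDecodeError lines mean the files being read contain a byte (0xe7) that is not valid UTF-8 at that position, and Python 3's open() was decoding as UTF-8 here. Below is a minimal sketch of the two workarounds, not a patch to the repository's scripts. The helper names read_text and write_sample, the sample file, and the choice of latin-1 are all assumptions made for illustration; the real encoding of the formula files is not stated anywhere in the output.

```python
import builtins
import os
import tempfile

# Python 2's raw_input() was renamed to input() in Python 3.
# Fall back to input() when raw_input is missing.
raw_input = getattr(builtins, "raw_input", input)


def read_text(path, encoding="latin-1"):
    """Read a text file with an explicit encoding instead of the default.

    latin-1 is an assumption: it maps every byte to a character, so it
    never raises, but it may not be the encoding the file was written in.
    """
    with open(path, encoding=encoding) as handle:
        return handle.read()


def write_sample(path):
    """Write a tiny stand-in file containing the byte 0xe7 (hypothetical data)."""
    with open(path, "wb") as handle:
        handle.write(b"fran\xe7ais\n")


if __name__ == "__main__":
    sample = os.path.join(tempfile.mkdtemp(), "sample.lst")
    write_sample(sample)
    try:
        # Same failure mode as in the tracebacks: 0xe7 is not valid UTF-8 here.
        read_text(sample, encoding="utf-8")
    except UnicodeDecodeError as err:
        print("utf-8 failed:", err.reason)
    # Decoding with an explicit single-byte encoding succeeds.
    print(repr(read_text(sample)))
```

In the scripts themselves, the equivalent change would be passing an encoding argument to the open() calls named in the tracebacks, or running them under a Python 2 interpreter if one is available.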

GhostOps77, Jul 22 '21 07:07