oletools
oletools copied to clipboard
olevba: UnicodeEncodeError when redirecting output to a file on Windows with Python 3
When the code of a VBA macro contains non-ASCII characters, olevba triggers a UnicodeEncodeError when the console output is redirected to a file, on Windows 10 with Python 3. The same file is processed properly when doing the same with Python 2, or when the output is printed on the console instead of being redirected. Sample file: test.zip (borrowed from https://github.com/kirk-sayre-work/talks/blob/master/test.docm) Output:
c:\...>ovba3 test.docm >olevba.txt
Traceback (most recent call last):
File "c:\Users\x\oletools\oletools\olevba.py", line 4077, in process_file
print(vba_code_filtered)
File "C:\Program Files\Python39\lib\encodings\cp1252.py", line 19, in encode
return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\ufffd' in position 1197: character maps to <undefined>
ERROR Error processing file test.docm ('charmap' codec can't encode character '\ufffd' in position 1197: character maps to <undefined>)!
Traceback (most recent call last):
File "c:\Users\x\oletools\oletools\olevba.py", line 4077, in process_file
print(vba_code_filtered)
File "C:\Program Files\Python39\lib\encodings\cp1252.py", line 19, in encode
return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\ufffd' in position 1197: character maps to <undefined>
Philippe, may I suggest adding an option -o outputfile so that we don't have to use > redirection. This makes it easy to use in commands like FOR /R %F IN (*.xls *.xlsb *.xlsm) DO olevba "%F" -o "%F".VBA
And maybe set the errorlevel on exit so 0 = no VBA or macros found 1 = XLM macros 2 = VBA macros 4 = malicious suspect etc
and then I could chain a command & IF errorlevel 0 DEL "%F".VBA or maybe && DEL "%F".VBA if I have that syntax right.