
UnicodeEncodeError: 'charmap' codec can't encode character '\u202f' in position 54: character maps to <undefined> encoding with 'cp437' codec failed

Open codejoey opened this issue 1 year ago • 16 comments

Tried to view an export zip via Mac terminal.

Ran into this error which halts it. Any ideas?

Traceback (most recent call last):
  File "/Users/Joey/.local/bin/slack-export-viewer", line 10, in <module>
    sys.exit(main())
             ^^^^^^
  File "/Users/Joey/Library/Application Support/pipx/venvs/slack-export-viewer/lib/python3.12/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/Joey/Library/Application Support/pipx/venvs/slack-export-viewer/lib/python3.12/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/Users/Joey/Library/Application Support/pipx/venvs/slack-export-viewer/lib/python3.12/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/Joey/Library/Application Support/pipx/venvs/slack-export-viewer/lib/python3.12/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/Joey/Library/Application Support/pipx/venvs/slack-export-viewer/lib/python3.12/site-packages/slackviewer/main.py", line 79, in main
    configure_app(app, archive, channels, no_sidebar, no_external_references, debug)
  File "/Users/Joey/Library/Application Support/pipx/venvs/slack-export-viewer/lib/python3.12/site-packages/slackviewer/main.py", line 21, in configure_app
    path = extract_archive(archive)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/Joey/Library/Application Support/pipx/venvs/slack-export-viewer/lib/python3.12/site-packages/slackviewer/archive.py", line 74, in extract_archive
    info.filename = info.filename.encode("cp437").decode("utf-8")
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/Cellar/python@3.12/3.12.2_1/Frameworks/Python.framework/Versions/3.12/lib/python3.12/encodings/cp437.py", line 12, in encode
    return codecs.charmap_encode(input,errors,encoding_map)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeEncodeError: 'charmap' codec can't encode character '\u202f' in position 54: character maps to <undefined>
encoding with 'cp437' codec failed

codejoey avatar Mar 01 '24 02:03 codejoey

I managed to get something working by first extracting the archive into a new directory and then pointing the viewer to that. I know, it's a workaround, but I haven't debugged it. It was trying to decode a UTF-8 string using cp437.
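For anyone who prefers to do the extraction step from Python rather than with an unzip tool, here is a minimal sketch. The helper name extract_export and the paths are hypothetical; it only uses the standard zipfile module and leaves member names exactly as zipfile decoded them, with no cp437 round trip.

```python
import zipfile
from pathlib import Path


def extract_export(archive_path, dest_dir):
    """Extract a zip export into dest_dir without renaming any members.

    Hypothetical helper for illustration; not part of slack-export-viewer.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive_path) as zf:
        # extractall() uses the member names as zipfile decoded them.
        zf.extractall(dest)
    return dest
```

After extraction, the viewer can be pointed at the resulting directory instead of the zip file, as described above.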

gunchev avatar Apr 10 '24 19:04 gunchev

I ran into the same error. Did you happen to export from slackdump? It might be that slackdump is using UTF-8 instead of cp437 for the zip file.
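A small sketch of why that would produce this exact error. The round trip in archive.py repairs names that were decoded as cp437 even though their bytes were UTF-8, but it cannot work on a name that was already decoded correctly and contains a character cp437 has no slot for, such as U+202F. Both file names below are made up for illustration.

```python
# Case 1: UTF-8 bytes that were decoded as cp437 (mojibake).
raw = "café.json".encode("utf-8")
mojibake = raw.decode("cp437")
# The round trip from archive.py restores the intended name.
repaired = mojibake.encode("cp437").decode("utf-8")

# Case 2: a name that was already decoded correctly and contains U+202F
# (narrow no-break space), which has no mapping in cp437.
already_ok = "Screenshot 3.41\u202fPM.png"
try:
    already_ok.encode("cp437")
    failed = False
except UnicodeEncodeError:
    failed = True  # same exception type as in the traceback above
```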

BLiu1 avatar Jul 13 '24 05:07 BLiu1

I had this problem too; it was complaining about files with áéíóú characters in them. My temporary solution was to comment out this line, and suddenly it worked fine:

# info.filename = info.filename.encode("cp437").decode("utf-8")

https://github.com/hfaran/slack-export-viewer/blob/0e829968ad3bf9411dc942f6344cdb3ac3dad9cc/slackviewer/archive.py#L74

I don't know why that encoding conversion is there in the first place though, I feel that it shouldn't be necessary and it creates problems for people that don't have English as their default language. 🤔

aalkz avatar Aug 09 '24 07:08 aalkz

same problem

jiangyi1985 avatar Aug 30 '24 04:08 jiangyi1985

Same problem, using an archive from slackdump on macOS.

conjon42 avatar Oct 05 '24 10:10 conjon42

+1. From slackdump on macOS, installed with Homebrew.

laviddichterman avatar Jan 21 '25 03:01 laviddichterman

@aalkz that change was added in this PR: https://github.com/hfaran/slack-export-viewer/pull/166 - if other people can confirm that reverting this change unblocks them then this change can be reverted (or perhaps added as an option in the program to enable or not).
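One possible middle ground between keeping and reverting the line, sketched under assumptions: the helper fix_member_name below is hypothetical and is not code from slack-export-viewer or from any PR. It relies on the fact that Python's zipfile decodes a member name as UTF-8 when general-purpose flag bit 11 (0x800) is set and as cp437 otherwise, so the round trip is attempted only for names without that flag, and the original name is kept if the round trip fails.

```python
import zipfile

UTF8_FLAG = 0x800  # general-purpose bit 11: member name is stored as UTF-8


def fix_member_name(info):
    """Return a repaired filename for a ZipInfo, or the original name.

    Hypothetical helper for illustration only.
    """
    if info.flag_bits & UTF8_FLAG:
        # zipfile already decoded this name as UTF-8; nothing to repair.
        return info.filename
    try:
        # Name was decoded as cp437; undo that and reinterpret as UTF-8.
        return info.filename.encode("cp437").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not representable in cp437, or not valid UTF-8: leave it alone.
        return info.filename
```

A plain revert is simpler; this variant would only matter for archives that store UTF-8 names without setting the flag.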

hfaran avatar Jan 21 '25 04:01 hfaran

I had the same problem; commenting out the line unblocked me. However, the viewer then doesn't show the direct messages, and I don't know whether that is related to this line of code or is an unrelated known problem.

Souukou avatar Jan 25 '25 06:01 Souukou

it won't be able to show the direct messages

There is a PR to revert the old default: https://github.com/hfaran/slack-export-viewer/pull/218

In the meantime, with the latest version you can use the command line flag --show-dms.

volker-fr avatar Jan 25 '25 21:01 volker-fr

In my case this was due to user names with accented characters like é. I unzipped the file, opened users.json and replaced all accented characters there. After zipping it again, generating with --html-only worked.

WarrenFaith avatar Feb 14 '25 22:02 WarrenFaith

I encountered the same issue with Korean as well. Commenting out the line at this location resolved the problem for me.

https://github.com/hfaran/slack-export-viewer/blob/3ea3e985b846ce0550d8302e891acaf140981121/slackviewer/archive.py#L77

nevertmr avatar Feb 25 '25 23:02 nevertmr

Can we give a folder rather than an archive? I had the same issue with Chinese characters in a filename.

alexisfrjp avatar Mar 06 '25 12:03 alexisfrjp

@alexisfrjp Yes, you can unpack the zip archive yourself in an empty directory and point the viewer to the directory (as others have suggested above). One tip, though. On at least some versions of macOS the unzip that comes installed by default is buggy, and isn't able to handle filenames with non-ASCII characters. To work around that problem you can install Homebrew's unzip and use that instead of /usr/bin/unzip.

bkline avatar May 10 '25 17:05 bkline

Also, I can confirm that the viewer's code without the attempt to round-trip the paths through cp437 successfully extracts all the files with the correct paths, and diff -r dir1 dir2 comes out clean (no differences) where dir1 is produced by that fixed code in the viewer and dir2 is populated by a non-broken unzip.
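For anyone who wants to repeat that check without diff, here is a rough Python equivalent using the standard filecmp module. The helper name trees_match is made up for this sketch. Note that filecmp.dircmp compares files shallowly by default (files with the same type, size and modification time are treated as equal without reading them) and skips a few default-ignored names, so it is less strict than diff -r.

```python
import filecmp


def trees_match(dir1, dir2):
    """Return True when both trees have the same entries and matching files.

    Hypothetical helper; a loose stand-in for `diff -r dir1 dir2`.
    """
    def same(cmp):
        # Any entry present on one side only, or any differing or
        # uncomparable file, means the trees do not match.
        if cmp.left_only or cmp.right_only or cmp.diff_files or cmp.funny_files:
            return False
        # Recurse into the subdirectories common to both sides.
        return all(same(sub) for sub in cmp.subdirs.values())

    return same(filecmp.dircmp(dir1, dir2))
```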

if other people can confirm that reverting this change unblocks them then this change can be reverted ...

@hfaran - do you have enough confirmations yet?

bkline avatar May 10 '25 19:05 bkline

I would be happy to review a PR if someone would like to submit one to revert #166, based on the comments above.

hfaran avatar May 15 '25 07:05 hfaran

Done. https://github.com/hfaran/slack-export-viewer/pull/228

bkline avatar May 15 '25 09:05 bkline