MCEdit-Unified

Still can't support unicode path name

Open fhfuih opened this issue 9 years ago • 25 comments

MCEdit can't start if its storage folder contains Unicode letters. I guess this bug was reported in issue #419, but the 1.4.0.1 release build (I tested the Windows version) still crashes. The dev version is fine and has been fine since commit 02d7683d69c2b9a31b7c51d6482f26fb8dcde47c.

fhfuih avatar Aug 30 '15 14:08 fhfuih

Can you please show us the log?

naor2013 avatar Aug 30 '15 14:08 naor2013

OK. Actually, the MCEdit console pops up and disappears in a flash, so the screenshot is a bit blurry. (screenshot) Note that the console shows the folder name as “杞 欢”, but the real folder name is “软件”. It looks like gibberish.

fhfuih avatar Aug 30 '15 14:08 fhfuih

For anyone who can't see it, the error is: Error loading Python DLL: F:[2 chinese letters]\mcedit.v1.4.0.1.win.64bit\mcedit\python.dll (error code 126) I wonder if it's not a unicode thing, since the letters in the error and the folder name are different.

Khroki avatar Aug 30 '15 23:08 Khroki

https://github.com/pyinstaller/pyinstaller/commit/ab5ab6fdcf160e0c5ecb341d33c9730faed67a37

codewarrior0 avatar Aug 30 '15 23:08 codewarrior0

Hmm, that's python 3 though.

Khroki avatar Aug 30 '15 23:08 Khroki

Hmm, not sure what the fix is, but for now put MCEdit in a directory with only western characters (drive names are fine) and it should run.

Khroki avatar Sep 01 '15 00:09 Khroki

Error 126 means that a DLL can't be found. In this case, I guess it is because of the Unicode characters in the path. It is a known bug with PyInstaller. As pointed out earlier, it has been fixed for Python 3, but not for Python 2...

LaChal avatar Sep 02 '15 20:09 LaChal

So it means we can't kill this bug now? :/

fhfuih avatar Sep 03 '15 07:09 fhfuih

The fact is that Python (like a lot of other stuff) does not handle encodings well...

We're looking closely at these encoding issues, but apart from workarounds, we can't provide anything else for now...

LaChal avatar Sep 03 '15 21:09 LaChal

This may be fixed in an upcoming version. We may upgrade PyInstaller soon. Will keep you posted.

Khroki avatar Jan 05 '16 05:01 Khroki

Can you try this build and let us know if it works? https://www.dropbox.com/s/diblg860juepoil/mcedit%201.5.x.x%20test.exe?dl=0 Note, does not contain all the fixes in the actual 1.5.1.0 version.

Khroki avatar Jan 08 '16 02:01 Khroki

@Khroki Err... It doesn't even work in an ASCII path; it returns this in the console:

No module named shell
Running in fixed nide. Support files are in your Docunents folder.
Splash load...
Traceback (most recent call last):
File “<string>”, line 13, in <module>
File “C:\build\pyinstaller—develop\Pylnstaller\loader\pyimodO3_iirporters. py”, line 363, in loadjiodule
File “C:\Users\Kris\Desktop\Tal’ s Folder\Minecraft\ICEdit\appdata\mcedit-maste r\splash.py”, line 22, in <nodule>
pygane. error: Couldn’t open C:\build\bin\dist\mcedit\splashes\splashl. png mcedit returned —l

In Unicode path, it just stops running, no response. Only the console window pops up saying nothing.

fhfuih avatar Jan 09 '16 09:01 fhfuih

Hmm, there's definitely a mapping issue going on, look at all those incorrect letters and added spaces. What locale is your system? As we know it works under Korean systems this seems odd. That build was with a new dev version of pyinstaller after their unicode fixes, although they may have missed something.

Khroki avatar Jan 15 '16 16:01 Khroki

If you typed that manually, then it may be an issue with our splash system, but I thought we had a fallback for that now. Also are you just doubleclicking the file or attempting to run it from another directory?

Khroki avatar Jan 15 '16 16:01 Khroki

Did you move mcedit after having used it?

If yes, delete the file splash which is in MCEdit folder.

LaChal avatar Jan 15 '16 19:01 LaChal

@Khroki @LaChal My system is Simplified Chinese 64-bit Windows 10. I just extract the zip and double-click on mcedit.exe, with no moving of files or anything else besides that. The files are stored in F:\add-and-delete-unicode-of-this-folder-name\mcedit\mcedit.exe. Plus, I've set every version of MCEdit on my PC to portable mode, so I don't think there is any file conflict.

After deleting splash it works, in a way. There's some weird stuff happening. The problems below only appear when the path contains Chinese characters.

  • Once I delete splash and run it for the first time, it shows this (with a splash image) and crashes. (screenshot) If I just run it a second time, it works, and it keeps working until I delete splash again. Every deletion causes this cycle to happen again.
  • I can't set it to portable mode; a Unicode error occurs when moving the files. (screenshot) But if I run it from an ASCII path and successfully set it to portable, then add Unicode characters to the path name, run it again and set it to fixed, the same log is returned but the files are still moved to Documents. If I then change back to portable, the files cannot be moved back.
  • It breaks when the folder name contains both Chinese characters and whitespace. Letters are okay; I don't know whether any other characters are affected. (screenshot) Most times it shows those 2 lines (sometimes it shows nothing) and a Windows pop-up appears saying MCEdit is not responding. In this case it can't even generate splash. This is also why I couldn't run it a week ago: I changed the folder name and found this bug.

Those are the bugs found so far.

fhfuih avatar Jan 16 '16 10:01 fhfuih

The next release will include a fix for the splash issue. Meanwhile, open splash and delete its content. You'll see only the default splash screen, not the other ones.

Concerning the issue when changing the install type, we need to work (again) on this :smile:

By the way, did you test the latest release (1.5.1.0)?

LaChal avatar Jan 16 '16 19:01 LaChal

No I haven't tested 1.5.1.0 yet. :)

fhfuih avatar Jan 17 '16 06:01 fhfuih

We think we finally fixed this under windows with 1.5.3.0. If possible could you test it?

Khroki avatar May 27 '16 04:05 Khroki

Well... For me the problem still exists in 1.5.3.0... This time the console pops up without showing any text and then disappears.

On the first run after extraction, though, there seems to be something on the console, but it's too hard to take a screenshot in time :sweat: (plus there must be some cache somewhere, because when I delete Documents/mcedit and the mcedit main folder and re-extract the software, that "something" doesn't show up.)

fhfuih avatar May 28 '16 09:05 fhfuih

Open a console and go to the folder you installed MCEdit. Then enter mcedit.exe.

LaChal avatar May 28 '16 11:05 LaChal

Oh yeah, there you go. (screenshot)

fhfuih avatar May 28 '16 11:05 fhfuih

Odd, I can run it fine from Chinese, Japanese, and Russian directories. Not sure what's gone wrong here. (screenshot) I'll try changing my system language later today and see if that makes it happen.

Khroki avatar May 28 '16 12:05 Khroki

At least there's traceback and no more python DLL error. :smile:

fhfuih avatar May 29 '16 07:05 fhfuih

> Odd, I can run it fine from Chinese, Japanese, and Russian directories. Not sure what's gone wrong here.

This specific case has to do with the __file__ attribute of a Python file. In Python 2.7, __file__ is always a str type and never unicode, thus on Windows it will always be codepage-encoded, and cannot represent all possible filenames. If you run MCEdit using the Python interpreter from a source checkout that is in a folder with Russian characters, while your current codepage ("Language for Non-Unicode Programs" in the Language And Region control panel) is not Russian, then __file__ will be an unusable filename containing question marks in place of the non-representable characters.
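A rough illustration of the question-mark effect, written so that it runs on Python 3 (the case above is Python 2.7). It uses Python's own codecs as a stand-in for the Windows codepage conversion, which may differ in detail. The folder name and the helper `codepage_view` are made up for the example.

```python
def codepage_view(path, codepage):
    """Return `path` as it survives a round trip through `codepage`.

    Characters the codepage cannot represent come back as '?'.
    """
    return path.encode(codepage, errors="replace").decode(codepage)


# Hypothetical source checkout in a folder with a Russian name.
path = u"F:\\Программы\\mcedit"

# With a Russian codepage the name survives unchanged.
print(codepage_view(path, "cp1251") == path)

# With a Western codepage every Cyrillic letter is lost.
print(codepage_view(path, "cp1252"))
```

The second result no longer names any real folder, which is why such a `__file__` value cannot be used to open anything.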

Because PyInstaller keeps its Python files in an archive, it synthesizes the __file__ attribute to point to the equivalent location in the built app's folder. To make it possible to run the app from a folder whose name is non-representable, PyInstaller also converts that path to its 8.3 ShortFileName form. In addition to shortening filenames to DOCUME~1 and the like, this also replaces any characters not encodable in the current codepage with their numeric codes. This is the reason for the workaround, and why @Khroki is able to run the built app from a folder containing Russian characters - he's not using a Russian codepage, so they get replaced with numeric codes. If you check the log file when running from that folder, you should see that the Russian characters are replaced with these numeric codes.

On @fhfuih's computer, the PyInstaller-provided __file__ attributes are still shortened, but because his codepage is set to Chinese and the filename contains Chinese characters, the __file__ attribute will contain Chinese characters, which cannot be decoded with the ascii codec, as the error indicates. However, they can be decoded with the filesystem encoding.
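A small sketch of that difference, runnable on Python 3. A `bytes` value stands in for a Python 2 `str` `__file__`, and `"gbk"` is assumed as the codec for a Simplified Chinese codepage; the path and the helper `try_decode` are invented for the example. Python 2 uses the ascii codec when it implicitly converts a `str` to `unicode`, which is the step that fails here.

```python
# Bytes standing in for a Python 2 `str` __file__ on a Chinese-codepage system.
raw_file = u"F:\\软件\\mcedit\\mcedit.pyc".encode("gbk")


def try_decode(raw, encoding):
    """Return the decoded text, or None if `raw` is not valid in `encoding`."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return None


# The ascii codec cannot decode the Chinese bytes.
print(try_decode(raw_file, "ascii") is None)

# The codec matching the codepage decodes them without loss.
print(try_decode(raw_file, "gbk") is not None)
```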

But that's just a technical detail about why it works for @Khroki but not for @fhfuih.

The core problem is that you are mixing str and unicode when dealing with filenames. (In this specific case you are joining some_module.__file__ with u"ini", which is mixing a str with a unicode.) The only way to solve it is to only use one type for filenames, either str or unicode.

If you choose str, then you won't be able to work with any files whose names can't be encoded in the current codepage. Additionally, if you get any unicode type filenames (e.g. from user input, file dialogs, and such) then you will need to encode them with sys.getfilesystemencoding() before using them. You will also have to deal with the UnicodeEncodeError that will result if said filename can't be encoded in the current codepage.
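As a sketch of what the str route involves (Python 3 syntax, with `bytes` playing the role of Python 2 `str`): the helper name `encode_filename` and its explicit `encoding` parameter are assumptions made so the behaviour can be shown independently of the machine's real codepage.

```python
import sys


def encode_filename(name, encoding=None):
    """Encode a text filename for a bytes-only API.

    Returns None when the name cannot be represented in `encoding`,
    which is the case the caller has to handle somehow.
    """
    encoding = encoding or sys.getfilesystemencoding()
    try:
        return name.encode(encoding)
    except UnicodeEncodeError:
        return None


# Representable in a Chinese codepage, so this round-trips.
encoded = encode_filename(u"软件", "gbk")
print(encoded is not None)

# Not representable in a Western codepage: the file is out of reach.
print(encode_filename(u"软件", "cp1252") is None)
```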

If you choose unicode, then again, if you get any str type filenames (e.g. from a module's __file__ attribute) then you will need to decode them using sys.getfilesystemencoding() - or better yet, obtain them as unicode in the first place if possible. unicode is the better option as it can represent all possible filenames.
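A minimal sketch of the unicode route, again in Python 3 syntax with `bytes` standing in for Python 2 `str`. The helper `to_text`, the example path and the `"gbk"` encoding are illustrative assumptions, not MCEdit's actual code; `ntpath` is used only so the Windows-style join behaves the same on any platform.

```python
import ntpath
import sys


def to_text(path, encoding=None):
    """Return `path` as text, decoding it first if it arrived as bytes."""
    if isinstance(path, bytes):
        return path.decode(encoding or sys.getfilesystemencoding())
    return path


# Stands in for some_module.__file__ on a Chinese-codepage system.
module_file = u"F:\\软件\\mcedit\\config.pyc".encode("gbk")

# Decode once at the boundary, then work with text only.
folder = ntpath.dirname(to_text(module_file, "gbk"))
ini_path = ntpath.join(folder, u"mcedit.ini")

print(ini_path == u"F:\\软件\\mcedit\\mcedit.ini")
```

Converting at the point where a filename enters the program keeps every later join and comparison on a single type.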

codewarrior0 avatar May 29 '16 09:05 codewarrior0