Processing local file fails if the path contains certain unicode characters

Open Kimbatt opened this issue 2 years ago • 3 comments

If I try to load a local file that has the character ű anywhere in its path, the splitting process fails. Characters from U+0100 (character code 256) upward seem to cause the problem: Ā (U+0100), ā (U+0101), etc. don't work, but ÿ (U+00FF) and þ (U+00FE) do. Tested on Windows with the following paths: X:\ű\test.mp3 and X:\ű.mp3

Console output:

[electron] BEGIN downloading/processing video "a327034531660779" - "test"
[electron] Splitting video "a327034531660779"; 4 jobs using model "htdemucs_ft"...
[electron] Running with "-d cpu" to force CPU instead of CUDA
[electron] child stderr:
[electron] C:\Development\demucs-cxfreeze-2\venv\lib\site-packages\torch\_jit_internal.py:839: UserWarning: Unable to retrieve source for @torch.jit._overload function: <function upsample at 0x000001877F20E9D0>.
...
many more warnings
...
[electron]
[electron] child stderr:
[electron] C:\Development\demucs-cxfreeze-2\venv\lib\site-packages\torch\_jit_internal.py:839: UserWarning: Unable to retrieve source for @torch.jit._overload function: <function norm at 0x000001877F401310>.
[electron]
[electron] child stderr:
[electron] Traceback (most recent call last):
[electron]   File "C:\Development\demucs-cxfreeze-2\venv\Lib\site-packages\cx_Freeze\initscripts\__startup__.py", line 138, in run
[electron]
[electron] child stderr:
[electron]   File "C:\Development\demucs-cxfreeze-2\venv\Lib\site-packages\cx_Freeze\initscripts\console.py", line 16, in run
[electron]
[electron] child stderr:
[electron]   File "main.py", line 4, in <module>
[electron]
[electron] child stderr:
[electron]   File "C:\Development\demucs-cxfreeze-2\venv\lib\site-packages\demucs\separate.py", line 158, in main
[electron]
[electron] child stderr:
[electron]   File "C:\Users\Nunya\AppData\Local\Programs\Python\Python39\lib\encodings\cp1252.py", line 19, in encode
[electron]
[electron] child stderr:
[electron] UnicodeEncodeError: 'charmap' codec can't encode character '\u0171' in position 20: character maps to <undefined>
[electron]
[electron] child stdout:
[electron] Selected model is a bag of 4 models. You will see that many progress bars per track.
[electron] Separated tracks will be stored in X:\temp\StemRoller-ukGKgl\separated\htdemucs_ft
[electron]
[electron] Trace: Error: Unable to find Demucs output directory
[electron]     at findDemucsOutputDir (X:\stemroller\main-src\processQueue.cjs:156:9)
[electron]     at async _processVideo (X:\stemroller\main-src\processQueue.cjs:212:26)
[electron]     at async processVideo (X:\stemroller\main-src\processQueue.cjs:286:5)
[electron]     at processVideo (X:\stemroller\main-src\processQueue.cjs:288:13)

The relevant part is probably this line: UnicodeEncodeError: 'charmap' codec can't encode character '\u0171' in position 20: character maps to <undefined>
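The cutoff described above can be turned into a rough pre-flight check on the input path. This is only a sketch: findRiskyChars is a hypothetical helper, not something that exists in StemRoller. The default threshold of 0xFF mirrors the observation in this report and is not an exact cp1252 test, since cp1252 can also encode a few characters above U+00FF (for example €). Passing 0x7F instead approximates an 'ascii' codec.

```javascript
// Hypothetical helper, not part of StemRoller: list the characters in a path
// whose code point is above `maxCodePoint` (heuristic, see the note above).
function findRiskyChars(filePath, maxCodePoint = 0xff) {
  const risky = []
  // for...of iterates by code point, so surrogate pairs stay together
  for (const ch of filePath) {
    const codePoint = ch.codePointAt(0)
    if (codePoint > maxCodePoint) {
      risky.push({
        char: ch,
        codePoint: 'U+' + codePoint.toString(16).toUpperCase().padStart(4, '0'),
      })
    }
  }
  return risky
}

// Example with the path from this report
console.log(findRiskyChars('X:\\ű\\test.mp3'))
```

An empty result does not guarantee success; it only means the path has no character above the chosen threshold.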

Kimbatt avatar Jan 05 '23 12:01 Kimbatt

Good to know, thanks for the detailed report! Not sure how to fix this right now but seems like a pretty serious bug; hopefully I'll get a chance to look into it sometime.

iffyloop avatar Jan 05 '23 22:01 iffyloop

I'm able to reproduce this on macOS as well. Processing a file in a folder named Ólafur Arnalds fails. Moving the file out of the offending folder fixes the issue. It would be nice to at least show the user a better error message if the fix is not straightforward. Thanks for your hard work on this app!
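One way a friendlier message could be produced is by scanning the child's stderr for the Python UnicodeEncodeError line shown in the console output above. The sketch below assumes the collected stderr text is available as a string; describeEncodingFailure is a hypothetical name, and the wording of the message is only a suggestion.

```javascript
// Hypothetical helper, not part of StemRoller: turn a Python UnicodeEncodeError
// found in the child's stderr into a message a user can act on.
function describeEncodingFailure(stderrText) {
  // Python escapes the offending character as \xNN, \uNNNN or \UNNNNNNNN
  const pattern =
    /UnicodeEncodeError: '([^']+)' codec can't encode character '\\(?:x([0-9a-f]{2})|u([0-9a-f]{4})|U([0-9a-f]{8}))'/
  const match = pattern.exec(stderrText)
  if (match === null) {
    return null // some other failure; the caller keeps its generic error
  }
  const codec = match[1]
  const codePoint = parseInt(match[2] || match[3] || match[4], 16)
  const label = 'U+' + codePoint.toString(16).toUpperCase().padStart(4, '0')
  return (
    `The file path contains the character "${String.fromCodePoint(codePoint)}" (${label}), ` +
    `which the "${codec}" codec used by the splitter cannot encode. ` +
    'Try moving or renaming the file so that its path avoids that character.'
  )
}
```

Returning null for anything else leaves the existing "Unable to find Demucs output directory" path untouched for unrelated failures.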

t1merickson avatar Apr 13 '23 11:04 t1merickson

I'm not sure it is the same issue, but I found an issue related to the locale of the child process. To reproduce this on macOS from the CLI:

$ pwd
/Users/soyu/git/stemroller
$ echo $LANG
ko_KR.UTF-8
$ PATH=$PWD/mac-extra-files/ThirdPartyApps/demucs-cxfreeze:$PWD/mac-extra-files/ThirdPartyApps/ffmpeg/bin demucs-cxfreeze 한.mp3 -n htdemucs_ft -j 1 --repo $PWD/anyos-extra-files/Models
# ... no UnicodeEncodeError
$ export LANG=""
$ PATH=$PWD/mac-extra-files/ThirdPartyApps/demucs-cxfreeze:$PWD/mac-extra-files/ThirdPartyApps/ffmpeg/bin demucs-cxfreeze 한.mp3 -n htdemucs_ft -j 1 --repo $PWD/anyos-extra-files/Models
...
UnicodeEncodeError: 'ascii' codec can't encode character '\ud55c' in position 17: ordinal not in range(128)

To work around this, you could add the following line to processQueue.cjs, line 68:

const CHILD_PROCESS_ENV = {
  CUDA_PATH: process.env.CUDA_PATH,
  PATH: process.env.PATH,
  TEMP: process.env.TEMP,
  TMP: process.env.TMP,
  LANG: "ko_KR.UTF-8" // added system locale
}

Another workaround is to modify L134 instead:

    curChildProcess = childProcess.spawn(command, args, {
      cwd,
      env: {
        ...process.env,
        ...CHILD_PROCESS_ENV
      },
    })

But that will not work when the app is launched from its icon on macOS. I think the right direction is to use app.getLocale() or something similar when spawning child processes.
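A minimal sketch of that direction, replacing the hardcoded "ko_KR.UTF-8" with a value derived from the system locale. Both function names are hypothetical. The sketch assumes the locale arrives as a BCP 47 tag such as "ko-KR" (the shape Electron's locale getters are documented to return), and the "en_US.UTF-8" fallback is an assumption that such a locale is installed.

```javascript
// Hypothetical helpers, not part of StemRoller.
// Convert a BCP 47 tag ("ko-KR", "zh-Hans-CN") into a POSIX UTF-8 locale name.
function toPosixUtf8Locale(systemLocale) {
  const parts = String(systemLocale || '').split('-')
  const language = parts[0]
  // The region is the first two-letter subtag after the language, if any
  const region = parts.slice(1).find((part) => /^[A-Za-z]{2}$/.test(part))
  if (!language || !region) {
    return 'en_US.UTF-8' // assumed fallback; any installed UTF-8 locale would do
  }
  return `${language.toLowerCase()}_${region.toUpperCase()}.UTF-8`
}

// Same keys as CHILD_PROCESS_ENV above, with LANG computed instead of hardcoded
function buildChildEnv(baseEnv, systemLocale) {
  return {
    CUDA_PATH: baseEnv.CUDA_PATH,
    PATH: baseEnv.PATH,
    TEMP: baseEnv.TEMP,
    TMP: baseEnv.TMP,
    LANG: toPosixUtf8Locale(systemLocale),
  }
}
```

In the app this would be passed to spawn as something like env: buildChildEnv(process.env, app.getLocale()). Whether it also helps on Windows, where the error comes from cp1252 rather than from LANG, is untested here.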

mmx900 avatar Jan 02 '24 12:01 mmx900

Thanks @mmx900 - your fix was added in #59, but I'm waiting for the build to complete so we can test launching it from the app icon. I tried using app.getSystemLocale instead of app.getLocale. I'm also skeptical that this will fix the issue on Windows, but I need to test later. I'll leave the issue open until multiple users can confirm it's fixed.

iffyloop avatar Apr 08 '24 08:04 iffyloop

Closing since nobody has commented about this issue recently. Please reopen if you run into the same error. Thanks mmx900 for your suggestion - it seems to have solved the issue.

iffyloop avatar May 03 '24 23:05 iffyloop