Cyrillic characters in filename not supported

Open JoKalliauer opened this issue 6 years ago • 12 comments

Processing a file with Cyrillic characters in the filename with:

$ scour Система_виробничих_відносин.svg output.svg

leads to an error message: IOError: [Errno 22] invalid mode ('rb') or filename: '???????_??????????_????????.svg'

Traceback (most recent call last):
  File "C:\Python27\Scripts\scour-script.py", line 11, in <module>
    load_entry_point('scour==0.36', 'console_scripts', 'scour')()
  File "c:\python27\lib\site-packages\scour\scour.py", line 3919, in run
    (input, output) = getInOut(options)
  File "c:\python27\lib\site-packages\scour\scour.py", line 3833, in getInOut
    infile = maybe_gziped_file(options.infilename, "rb")
  File "c:\python27\lib\site-packages\scour\scour.py", line 3828, in maybe_gziped_file
    return open(filename, mode)
IOError: [Errno 22] invalid mode ('rb') or filename: '???????_??????????_????????.svg'

JoKalliauer avatar Dec 26 '17 18:12 JoKalliauer

Hi,

I cannot reproduce this in git on master (nor at the v0.36 tag):

$ time PYTHONPATH=. python3 -m scour.scour -i Система_виробничих_відносин.svg -o output.svg
Scour processed file "Система_виробничих_відносин.svg" in 371 ms: 29940/58953 bytes new/orig -> 50.8%

@JoKalliauer: What version of scour were you using when you got this error?

nthykier avatar Feb 17 '18 18:02 nthykier

FTR, my OS locale defaults to a UTF-8 encoding and it also works flawlessly with my Python 2.7 installation.

nthykier avatar Feb 17 '18 18:02 nthykier

Thanks, I now know what went wrong, but changing the default encoding in the Windows 10 registry can damage the OS (it may no longer boot).

  1. I read https://stackoverflow.com/a/17177904/6747994
  2. Then I ran chcp and got Aktive Codepage: 850., which is an encoding used in DOS in Western Europe; for more information see: https://en.wikipedia.org/wiki/Code_page_850
  3. https://superuser.com/questions/269818/change-default-code-page-of-windows-console-to-utf-8 says that setting the registry key HKEY_LOCAL_MACHINE\Software\Microsoft\Command Processor\Autorun to @chcp 65001>nul would solve the problem, but just above that the same user wrote: Changing [registrykey] value to 65001 appear to make the system unable to boot in my case.

Therefore I don't want to change the default encoding in the Windows 10 registry to UTF-8 just because of scour. svgo and svgcleaner don't have any problems with UTF-8 characters in the filename on my Windows 10 system, but scour does.
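
For illustration, the question marks in the error message are what one would expect if the file name is squeezed through a legacy code page somewhere between the shell and Python. Code page 850 contains no Cyrillic letters, so a lossy conversion replaces each of them with "?". The sketch below only mimics such a conversion in Python 3. The helper name squeeze_through_codepage is made up for this example, and it is an assumption (not something confirmed in this thread) that the failing setup converts the name through exactly this code page.

```python
def squeeze_through_codepage(name, codepage="cp850"):
    """Mimic a lossy round trip through a legacy code page.

    Characters the code page cannot represent are replaced by '?'.
    The function name and the default code page are illustrative assumptions.
    """
    return name.encode(codepage, errors="replace").decode(codepage)


# Same file name as in the commands used in this thread.
print(squeeze_through_codepage("zСистема.svg"))
```

The result has the same shape as the name in the error message: the ASCII characters survive and every Cyrillic letter turns into a question mark, so the file can no longer be found under that name.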

@nthykier : I use scour 0.36

JoKalliauer avatar Feb 17 '18 19:02 JoKalliauer

The problem seems to be the executable python (or pip?) creates.

I'm on Windows 10 myself and if I run

python -m scour.scour zСистема.svg zСистема2.svg

it works flawlessly.

Ede123 avatar Feb 17 '18 19:02 Ede123

@nthykier

time PYTHONPATH=. python3 -m scour.scour -i Система_виробничих_відносин.svg -o output.svg

I get:

/usr/bin/python3: Error while finding module specification for 'scour.scour' (ModuleNotFoundError: No module named 'scour')

real 0m0,149s
user 0m0,062s
sys 0m0,046s

How can I solve this issue?

@Ede123 When I run

python -m scour.scour zСистема.svg zСистема2.svg

I get /usr/bin/python: No module named scour. How can I solve this issue?

According to https://stackoverflow.com/a/2326045/6747994 and https://stackoverflow.com/a/7587545/6747994 I might have to add the path of scour to sys.path or $PYTHONPATH? (Maybe you can help me?)

In cygwin.exe as well as in cmd.exe I get:

$ which scour
/cygdrive/c/Python27/Scripts/scour
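
A possible way to narrow this down: the error above comes from /usr/bin/python, while scour was installed under C:\Python27, so the interpreter being started may simply not be the one scour was installed for. The following sketch (Python 3, standard library only) prints which interpreter is running, whether it can find a top-level scour package, and where it searches. The helper name module_location is hypothetical.

```python
import importlib.util
import sys


def module_location(name):
    """Return the file a top-level module would be loaded from, or None if it is not found."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


print("interpreter:", sys.executable)
print("scour found at:", module_location("scour"))
for entry in sys.path:
    print("  search path:", entry)
```

If "scour found at" prints None, this interpreter does not see the package; installing scour for that interpreter, or starting the interpreter that already has it, is usually simpler than editing sys.path by hand.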

JoKalliauer avatar Feb 17 '18 20:02 JoKalliauer

I haven't used cygwin in a long time, sorry.

Just noticed I can not make it work in Python 2 either, so maybe it's a general issue in Python 2 (Python 3 works when loading the module directly for me).

You can also try to use short filenames (they usually work when everything else fails).
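
As a rough sketch of the short-filename idea: on Windows the 8.3 form of an existing path can be queried with the Win32 function GetShortPathNameW, for example through ctypes. The helper name short_path is made up for this example, it assumes Python 3, and short names only exist if 8.3 name generation is enabled on the volume. On other platforms the sketch returns the path unchanged.

```python
import ctypes
import sys


def short_path(path):
    """Return the 8.3 short form of an existing path on Windows.

    On other platforms, or if the lookup fails, the path is returned unchanged.
    """
    if sys.platform != "win32":
        return path
    get_short = ctypes.windll.kernel32.GetShortPathNameW
    get_short.argtypes = [ctypes.c_wchar_p, ctypes.c_wchar_p, ctypes.c_uint32]
    get_short.restype = ctypes.c_uint32
    # First call with an empty buffer to learn the required size (including the NUL).
    size = get_short(path, None, 0)
    if size == 0:
        return path  # lookup failed, e.g. the file does not exist
    buffer = ctypes.create_unicode_buffer(size)
    get_short(path, buffer, size)
    return buffer.value


print(short_path("zСистема.svg"))
```

The short name contains only ASCII characters, so it can be passed to scour even when the long name gets mangled.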

Ede123 avatar Feb 17 '18 20:02 Ede123

@Ede123

  • I installed Python 3.6
  • ran pip3 install scour
  • which scour now reports /cygdrive/c/Program Files (x86)/Python36-32/Scripts/scour
  • running the command scour -i zСистема.svg -o zСистема2.svg leads to:

Traceback (most recent call last):
  File "C:\Program Files (x86)\Python36-32\Scripts\scour-script.py", line 11, in <module>
    load_entry_point('scour==0.36', 'console_scripts', 'scour')()
  File "c:\program files (x86)\python36-32\lib\site-packages\scour\scour.py", line 3919, in run
    (input, output) = getInOut(options)
  File "c:\program files (x86)\python36-32\lib\site-packages\scour\scour.py", line 3833, in getInOut
    infile = maybe_gziped_file(options.infilename, "rb")
  File "c:\program files (x86)\python36-32\lib\site-packages\scour\scour.py", line 3828, in maybe_gziped_file
    return open(filename, mode)
OSError: [Errno 22] Invalid argument: 'z???????.svg'

Therefore I don't think it is a Python-2-bug.

JoKalliauer avatar Feb 17 '18 20:02 JoKalliauer

Please read properly: https://github.com/scour-project/scour/issues/158#issuecomment-366466536

Ede123 avatar Feb 17 '18 20:02 Ede123

@Ede123 I think I don't understand your comment.

scour.exe does not work properly, but the source code does?

If I would use the code python -m scour.scour zСистема.svg zСистема2.svg then there shouldn't be any encoding problem?

But if I run python -m scour.scour zСистема.svg zСистема2.svg on my computer I get the error: /usr/bin/python: No module named scour. I'm not very familiar with programming and I don't have a clue what Python is, therefore I don't know how to solve this issue.

I googled and found:

  • https://stackoverflow.com/a/339220/6747994
  • https://stackoverflow.com/a/2326045/6747994
  • https://stackoverflow.com/a/7587545/6747994

but I do not understand them, and I don't know what to append to sys.path or how to do it.

JoKalliauer avatar Feb 17 '18 21:02 JoKalliauer

> scour.exe does not work properly, but the source code does?

The executable is only a wrapper automatically created by pip. It seems the wrapper causes additional problems when dealing with non-ASCII characters (and is broken for both Python 2 and Python 3).

> If I would use the code python -m scour.scour zСистема.svg zСистема2.svg then there shouldn't be any encoding problem?

It rules out problems with the wrapper and leaves you with problems in Python itself: in Python 3 it seems to work, in Python 2 it seems to be broken.

> But if I run python -m scour.scour zСистема.svg zСистема2.svg on my computer I get the error: /usr/bin/python: No module named scour.

As suggested before, try not to run it in a Cygwin shell. If you have CPython installed from python.org it should work (in cmd.exe). The Cygwin environment will likely create even more problems.
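
To make the difference concrete, here is a minimal sketch of calling scour as a module, bypassing the pip-generated wrapper. It only builds the command line; the -i and -o options are the ones used earlier in this thread, and the helper name scour_module_command is hypothetical.

```python
import sys


def scour_module_command(infile, outfile, python=sys.executable):
    """Build a command that runs scour as a module instead of through the pip-generated wrapper."""
    return [python, "-m", "scour.scour", "-i", infile, "-o", outfile]


cmd = scour_module_command("zСистема.svg", "zСистема2.svg")
print(cmd)

# To actually run it (requires scour to be installed for this interpreter):
#     import subprocess
#     subprocess.run(cmd, check=True)
```

Using sys.executable keeps the call tied to the interpreter that is currently running, which avoids picking up a different Python (for example the Cygwin one) from the PATH.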

Ede123 avatar Feb 17 '18 21:02 Ede123

@Ede123 Thanks: python -m scour.scour zСистема.svg zСистема2.svg works with Python3 (not in Python2) in cmd.exe

Since Python 2.7 is still the default, I'm not sure if this issue can/should be closed.

JoKalliauer avatar Feb 17 '18 22:02 JoKalliauer

Well, as I wrote it's basically a Python issue... See https://bugs.python.org/issue2128 for some older information on this.

I'm not sure if it even could be worked around from our side and I also feel there are more important things...
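
For anyone who wants to check what their interpreter actually receives, a small diagnostic along these lines may help. It is only a sketch: the helper name describe_args is made up, and ascii() is used so that the output is printable on any console. Save it as a script and pass it the file name in question.

```python
import os
import sys


def describe_args(args):
    """Return (escaped representation, exists-on-disk) for each argument."""
    return [(ascii(arg), os.path.exists(arg)) for arg in args]


for shown, exists in describe_args(sys.argv[1:]):
    print(shown, "exists" if exists else "NOT FOUND")
```

If the argument arrives intact, the Cyrillic letters show up as \u escapes and the file is found; if it shows literal question marks instead, the name was already damaged before scour saw it.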

Ede123 avatar Feb 17 '18 22:02 Ede123