opensmile-python

UnicodeEncodeError

Open felixbur opened this issue 2 years ago • 8 comments

Getting a UnicodeEncodeError:

lib/python3.9/site-packages/opensmile/core/SMILEapi.py", line 237, in
map(lambda v: bytes(str(v), "ascii"), sum(options.items(), ())))
UnicodeEncodeError: 'ascii' codec can't encode character '\u0308' in position 58: ordinal not in range(128)

Happened with Python 3.9 under macOS.

To solve the error, specify the correct encoding, e.g. utf-8

felixbur, Nov 18 '22 08:11
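For readers who want to reproduce the report, here is a minimal sketch that does not need openSMILE. It reuses the conversion expression from the traceback; the option name `inputfile` and the file path are made up for illustration. The path is normalized to NFD, which is one common way a combining diaeresis (`'\u0308'`) ends up in file names on macOS, although the report does not say where the character came from.

```python
import unicodedata

# Hypothetical option name and path, chosen only for illustration.
# NFD normalization splits "ä" into "a" + combining diaeresis (U+0308).
options = {"inputfile": unicodedata.normalize("NFD", "/data/Gespräch.wav")}


def encode_options(options, encoding):
    # Same flattening and conversion as the expression shown in the traceback:
    # sum(..., ()) concatenates the (key, value) tuples into one flat tuple.
    return list(map(lambda v: bytes(str(v), encoding), sum(options.items(), ())))


try:
    encode_options(options, "ascii")
except UnicodeEncodeError as err:
    print(err)  # the 'ascii' codec error from the report

# With "utf-8" the conversion itself succeeds; whether openSMILE handles
# the resulting bytes correctly is a separate question.
encoded = encode_options(options, "utf-8")
```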

At the moment we only test older Python versions: [image]

Maybe we should also update this.

hagenw, Nov 18 '22 08:11

This problem is expected to occur if you pass arguments to openSMILE with non-ASCII characters, e.g. file paths containing special characters. openSMILE does not have official support for UTF-8 in config files, at least we have not tested what happens when you use UTF-8. If you're lucky, it might work on some platforms out-of-the-box but this is something we cannot guarantee. So the recommendation from my side would be to ensure there are no special characters in file paths and openSMILE options. And in config files, there should never be the need for special characters anyway.

chausner-audeering, Nov 18 '22 08:11
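To follow that recommendation it helps to see exactly which characters in a path or option are non-ASCII. The helper below is a hypothetical sketch, not part of opensmile; it uses only the standard library and the example path is invented.

```python
import unicodedata


def find_non_ascii(text):
    """Return (index, character, Unicode name) for every non-ASCII character."""
    return [
        (i, ch, unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(text)
        if ord(ch) > 127
    ]


# Example path with a decomposed "ä" (a + U+0308), invented for illustration.
path = "/data/Gespra\u0308ch.wav"
for index, ch, name in find_non_ascii(path):
    print(f"position {index}: {ch!r} ({name})")
```

The reported index corresponds to the "position" number in the UnicodeEncodeError message, which makes it easier to locate the offending character in a long path.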

> openSMILE does not have official support for UTF-8 in config files, at least we have not tested what happens when you use UTF-8

Should we still switch to UTF-8 encoding in SMILEapi.py, as it seems to solve the issue at least in some cases?

frankenjoe, Nov 18 '22 09:11

I wouldn't, because issues caused by non-ASCII input would become harder to debug: the error would then occur at a later point, possibly with an unrelated error message. In the best case, you would get an error that a file couldn't be found, and you might figure out that it's due to special characters in the path.

chausner-audeering, Nov 18 '22 09:11

> If you're lucky, it might work on some platforms out-of-the-box but this is something we cannot guarantee

This sounds not like a nice behavior to me, so maybe we should just raise an error if an argument (whatever argument means here ;) ) contains non-ASCII?

hagenw, Nov 18 '22 09:11

Or maybe fix openSMILE to support non-ASCII. We are in the year 2022 :)

frankenjoe, Nov 18 '22 09:11

> This sounds not like a nice behavior to me, so maybe we should just raise an error if an argument (whatever argument means here ;) ) contains non-ASCII?

Yes, this is what happens at the moment. Any inputs with non-ASCII characters will throw the exception reported by @felixbur. I was referring to if we just change "ascii" to "utf-8", then you might be lucky that it will work but we don't know for sure without checking the code.

chausner-audeering, Nov 18 '22 10:11

> > This sounds not like a nice behavior to me, so maybe we should just raise an error if an argument (whatever argument means here ;) ) contains non-ASCII?
>
> Yes, this is what happens at the moment. Any inputs with non-ASCII characters will throw the exception reported by @felixbur. I was referring to if we just change "ascii" to "utf-8", then you might be lucky that it will work but we don't know for sure without checking the code.

Ah, ok, but then we should update the error message. At least I'm not able to understand what is going wrong when seeing:

map(lambda v: bytes(str(v), "ascii"), sum(options.items(), ())))

hagenw, Nov 18 '22 11:11
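One possible shape for a clearer error message, as a rough sketch only: the function name `encode_options_ascii` and the message wording are assumptions, not the actual code in SMILEapi.py. It keeps the ASCII restriction discussed above but names the offending option and character instead of surfacing the bare `map(...)` line.

```python
def encode_options_ascii(options):
    """Encode option names and values as ASCII bytes with a readable error.

    Hypothetical replacement for the map/lambda expression in the traceback.
    """
    encoded = []
    for name, value in options.items():
        for item in (name, value):
            text = str(item)
            try:
                encoded.append(text.encode("ascii"))
            except UnicodeEncodeError as err:
                # err.start is the index of the first character that failed.
                raise ValueError(
                    f"Option {name!r} contains the non-ASCII character "
                    f"{text[err.start]!r} at position {err.start}: {text!r}. "
                    "Only ASCII characters are supported in openSMILE "
                    "options and file paths."
                ) from err
    return encoded
```

For ASCII-only input the result has the same flat name, value, name, value order as the original expression; for anything else the caller sees which option to fix.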