Forban
German Umlauts break search
Hi, on a remote system, I have a file with an "ö" in its name. This breaks the search on every connected Forban.
Browsing works.
Error in forban_share_error.log:
--- will be included soon ---
Any idea?
Matthias
[21/Nov/2012:08:02:55] HTTP Traceback (most recent call last):
  File "/opt/forban/lib/ext/cherrypy/_cprequest.py", line 656, in respond
    response.body = self.handler()
  File "/opt/forban/lib/ext/cherrypy/lib/encoding.py", line 188, in __call__
    self.body = self.oldhandler(*args, **kwargs)
  File "/opt/forban/lib/ext/cherrypy/_cpdispatch.py", line 34, in __call__
    return self.callable(*self.args, **self.kwargs)
  File "/opt/forban/bin/forban_share.py", line 246, in q
    html += """
[21/Nov/2012:08:02:55] HTTP Request Headers:
  REFERER: http://piratebox.lan:12555/
  HOST: piratebox.lan:12555
  CONNECTION: keep-alive
  CACHE-CONTROL: max-age=0
  Remote-Addr: ::ffff:192.168.1.168
  ACCEPT: application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
  ACCEPT-CHARSET: ISO-8859-1,utf-8;q=0.7,*;q=0.3
  USER-AGENT: Mozilla/5.0 (X11; U; Linux i686; en-US) AppleWebKit/534.3 (KHTML, like Gecko) Chrome/6.0.472.63 Safari/534.3
  ACCEPT-LANGUAGE: en-US,en;q=0.8
  ACCEPT-ENCODING: gzip,deflate,sdch
[21/Nov/2012:08:03:01] HTTP Traceback (most recent call last):
  File "/opt/forban/lib/ext/cherrypy/_cprequest.py", line 656, in respond
    response.body = self.handler()
  File "/opt/forban/lib/ext/cherrypy/lib/encoding.py", line 188, in __call__
    self.body = self.oldhandler(*args, **kwargs)
  File "/opt/forban/lib/ext/cherrypy/_cpdispatch.py", line 34, in __call__
    return self.callable(*self.args, **self.kwargs)
  File "/opt/forban/bin/forban_share.py", line 246, in q
    html += """
[21/Nov/2012:08:03:01] HTTP Request Headers:
  REFERER: http://piratebox.lan:12555/
  HOST: piratebox.lan:12555
  CONNECTION: keep-alive
  CACHE-CONTROL: max-age=0
  Remote-Addr: ::ffff:192.168.1.168
  ACCEPT: application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
  ACCEPT-CHARSET: ISO-8859-1,utf-8;q=0.7,*;q=0.3
  USER-AGENT: Mozilla/5.0 (X11; U; Linux i686; en-US) AppleWebKit/534.3 (KHTML, like Gecko) Chrome/6.0.472.63 Safari/534.3
  ACCEPT-LANGUAGE: en-US,en;q=0.8
  ACCEPT-ENCODING: gzip,deflate,sdch
[21/Nov/2012:08:03:08] HTTP Traceback (most recent call last):
  File "/opt/forban/lib/ext/cherrypy/_cprequest.py", line 656, in respond
    response.body = self.handler()
  File "/opt/forban/lib/ext/cherrypy/lib/encoding.py", line 188, in __call__
    self.body = self.oldhandler(*args, **kwargs)
  File "/opt/forban/lib/ext/cherrypy/_cpdispatch.py", line 34, in __call__
    return self.callable(*self.args, **self.kwargs)
  File "/opt/forban/bin/forban_share.py", line 246, in q
    html += """
[21/Nov/2012:08:03:08] HTTP Request Headers:
  REFERER: http://piratebox.lan:12555/
  HOST: piratebox.lan:12555
  CONNECTION: keep-alive
  CACHE-CONTROL: max-age=0
  Remote-Addr: ::ffff:192.168.1.168
  ACCEPT: application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
  ACCEPT-CHARSET: ISO-8859-1,utf-8;q=0.7,*;q=0.3
  USER-AGENT: Mozilla/5.0 (X11; U; Linux i686; en-US) AppleWebKit/534.3 (KHTML, like Gecko) Chrome/6.0.472.63 Safari/534.3
  ACCEPT-LANGUAGE: en-US,en;q=0.8
  ACCEPT-ENCODING: gzip,deflate,sdch
[21/Nov/2012:08:03:45] HTTP Traceback (most recent call last):
  File "/opt/forban/lib/ext/cherrypy/_cprequest.py", line 656, in respond
    response.body = self.handler()
  File "/opt/forban/lib/ext/cherrypy/lib/encoding.py", line 188, in __call__
    self.body = self.oldhandler(*args, **kwargs)
  File "/opt/forban/lib/ext/cherrypy/_cpdispatch.py", line 34, in __call__
    return self.callable(*self.args, **self.kwargs)
  File "/opt/forban/bin/forban_share.py", line 246, in q
    html += """
[21/Nov/2012:08:03:45] HTTP Request Headers:
  REFERER: http://piratebox.lan:12555/
  HOST: piratebox.lan:12555
  CONNECTION: keep-alive
  CACHE-CONTROL: max-age=0
  Remote-Addr: ::ffff:192.168.1.168
  ACCEPT: application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
  ACCEPT-CHARSET: ISO-8859-1,utf-8;q=0.7,*;q=0.3
  USER-AGENT: Mozilla/5.0 (X11; U; Linux i686; en-US) AppleWebKit/534.3 (KHTML, like Gecko) Chrome/6.0.472.63 Safari/534.3
  ACCEPT-LANGUAGE: en-US,en;q=0.8
  ACCEPT-ENCODING: gzip,deflate,sdch
btw: in the browse list, the umlaut looks like a UTF character (double byte)
Hi Matthias,
I did a small test writing a file named "ö.txt" in the share directory.
http://127.0.0.1:12555/q/?v=%C3%B6
I didn't get the same exception. Could you start a Python interpreter on the server and check the default encoding?
import sys
print sys.getdefaultencoding()
Just to be sure.
root@rPt4WCYo:/# python
Python 2.7.3 (default, Nov 3 2012, 11:37:47)
[GCC 4.6.3 20120201 (prerelease)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> print sys.getdefaultencoding()
ascii
I found the issue(s), but I'm currently struggling to fix it properly. The issue comes from the incoming filename value (encoded in UTF-8): the default codec in Python (for forban_share, lines 243 to 250) is usually ascii, and the filename is then passed back through Python's base64 encoding library, where UTF-8 is not appreciated...
I tested with some ".decode("utf-8").encode("latin-1")", but it doesn't work consistently across Python versions, especially with regard to the site configuration of the encoding. If you have any ideas, let me know. I'll check some other ideas.
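A minimal sketch of the suspected mechanism, not Forban's actual code. The traceback above is cut off before the exception line, so the exact error is an assumption. The sketch only shows that the UTF-8 bytes of "ö.txt" cannot be decoded with the ascii codec, which is what Python 2 attempts implicitly when byte strings and unicode strings are mixed. The helper name decode_with is hypothetical.

```python
# -*- coding: utf-8 -*-
# Sketch only: reproduces a decode failure, not Forban's code path.


def decode_with(codec, raw_bytes):
    """Try to decode raw_bytes with the given codec; report failures as text."""
    try:
        return raw_bytes.decode(codec)
    except UnicodeDecodeError as exc:
        return u"UnicodeDecodeError: " + exc.reason


# "ö.txt" as UTF-8 bytes: "ö" becomes the two bytes 0xC3 0xB6.
utf8_name = b"\xc3\xb6.txt"

# The ascii codec rejects 0xC3 because it is outside the 7-bit range.
print(decode_with("ascii", utf8_name))

# Decoding with the codec the bytes were written in succeeds.
print(decode_with("utf-8", utf8_name) == u"\u00f6.txt")
```

If this is indeed the mechanism, the fix would be to decode explicitly at the point where the filename enters the program, instead of relying on the default codec.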
Is it possible to redefine the default encoding around the base64 decoding and turn it back to ascii afterwards? import sys; sys.setdefaultencoding('utf-8') Or what about reducing every filename (completely, while hashing, searching, and so on) to ascii?
I learned a few things about all this sh**** encoding stuff while doing system administration and user help for IBM WebSphere MQ: you have to know which encoding enters your system and which one you use inside (i.e. during modification). I think one problem may be a filename on the disk that is not encoded in UTF but contains a special character in, e.g., ISO...-15.
The complete platform-independent steps should be something like this:
- Get Filename
- Convert Filename to UTF (if it already is, this shouldn't change anything)
- encode to base64
- decode to string in UTF (assuming you can accept UTF encoding while decoding)
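The four steps above can be sketched as follows. This is an illustration under stated assumptions, not Forban's implementation: the function names filename_to_token and token_to_filename are hypothetical, the standard base64 alphabet is assumed, and the ISO-8859-15 fallback for names that are not valid UTF-8 is a guess based on the legacy-encoding case mentioned above.

```python
# -*- coding: utf-8 -*-
# Sketch only: hypothetical helpers, not taken from forban_share.py.
from __future__ import unicode_literals

import base64


def filename_to_token(raw_name, fallback="iso-8859-15"):
    """Steps 1-3: take a filename, normalise it to UTF-8, base64-encode it."""
    if isinstance(raw_name, bytes):
        try:
            text = raw_name.decode("utf-8")
        except UnicodeDecodeError:
            # Assumption: names that are not UTF-8 use the fallback codec.
            text = raw_name.decode(fallback)
    else:
        text = raw_name  # already a text (unicode) string
    # base64 works on bytes, so encode explicitly instead of relying on
    # the interpreter's default codec.
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def token_to_filename(token):
    """Step 4: decode the base64 token back into a text string."""
    return base64.b64decode(token.encode("ascii")).decode("utf-8")


# Round trip for a UTF-8 name and for the same name in ISO-8859-15.
assert token_to_filename(filename_to_token(b"\xc3\xb6.txt")) == "\u00f6.txt"
assert token_to_filename(filename_to_token(b"\xf6.txt")) == "\u00f6.txt"
```

The point of the design is that every conversion names its codec, so the result should not depend on sys.getdefaultencoding() or on the site configuration.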
If the normal base64.decode can't handle this well, you may try this library for encode and decode: http://docs.python.org/2.7/library/binascii.html?highlight=binascii#binascii
At a quick glance, it looks like a "convert any byte array to hex" functionality. This should work like the default base64 function... with the drawback that you have to convert back to a string again.
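For illustration, a small sketch of the binascii route, with made-up variable names. It hex-encodes the UTF-8 bytes of "ö.txt" and converts them back, and it also compares binascii.b2a_base64 with base64.b64encode on the same input.

```python
# -*- coding: utf-8 -*-
# Sketch only: compares binascii and base64 on one sample filename.
import base64
import binascii

raw = b"\xc3\xb6.txt"  # "ö.txt" as UTF-8 bytes

# Hex encoding: every byte becomes two ASCII hex digits.
hex_token = binascii.hexlify(raw)
restored = binascii.unhexlify(hex_token)

# binascii also offers base64; it appends a newline that base64.b64encode
# does not, so strip it before comparing.
b64_via_binascii = binascii.b2a_base64(raw).rstrip(b"\n")
b64_via_base64 = base64.b64encode(raw)

print(restored == raw)
print(b64_via_binascii == b64_via_base64)
```

Both modules operate on bytes and give the same base64 output here, so switching modules alone is unlikely to change how non-ASCII filenames behave. The conversion between text and bytes still has to be done explicitly around either call.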
Thanks for the feedback.
Step 4 is exactly where the issue is. Python's base64 module also relies on the binascii module. I'll give it another try.
Hi, I just found out that this issue also breaks the "remote download" functionality. When you are visiting Forban on your box, click "browse" in the line of another Forban and then "get", you receive a 404 error saying that /s/ is not available.
:( Matthias
Try this: Add the following lines to your app config.
tools.decode.on = True
tools.encode.on = True
tools.encode.encoding = "utf-8"
tools.decode.encoding = "utf-8"
via http://stackoverflow.com/a/4915497/359326
Yep, I tried that some time ago, but the result varies depending on the Python 2 version and the platform it's running on. I'll build a set of test cases to see where the issue originates. Thank you.