py-leveldb-windows
Does it support Unicode paths?
Can I create a db at a path like c:\tmp\中文-español\ ?
I am not sure. But if the official leveldb supports it, there is no reason this code can't.
Unfortunately, the official leveldb does not support Windows :(
It is the same question on Linux: can the Linux version support /usr/xxx/中文目录/ ?
The default encoding on Linux is typically UTF-8, which can represent any Unicode character, so there is no problem there. On Windows, however, the default is the system ANSI code page, Windows-1251 for example. So the C code must do some conversions to support Unicode on Windows.
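To illustrate the difference, here is a minimal Python 3 sketch that checks whether a path can be represented at all in a given encoding. The helper name `can_encode` is made up for this example, and `cp1251` stands in for whatever ANSI code page a particular Windows machine uses.

```python
def can_encode(path, codec):
    """Return True if every character of `path` is representable in `codec`."""
    try:
        path.encode(codec)
        return True
    except UnicodeEncodeError:
        return False

path = u'c:\\tmp\\中文-español'
print(can_encode(path, 'utf-8'))   # UTF-8 can represent any Unicode character
print(can_encode(path, 'cp1251'))  # Windows-1251 has no Chinese characters and no 'ñ'
```

So a path like this cannot be passed through an ANSI `char*` on such a system without losing information.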
When we call this:
leveldb_t* leveldb_open(const leveldb_options_t* options, const char* name, char** errptr);
by default name holds the path to the db in the ANSI encoding on Windows and in UTF-8 on Linux. This way we can't access a path on Windows that is not representable in the system encoding. To access such paths on Windows we should pass name as UTF-8 too, but the Windows port of leveldb must expect this, convert UTF-8 to UTF-16, and call the Unicode functions of the Windows API (CreateFileW instead of CreateFileA).
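The conversion step described above can be sketched in Python 3 as follows. This only shows what the bytes look like before and after; `utf8_to_wide` is a hypothetical helper, not part of leveldb or of this port. In C on Windows the same job is normally done with the Win32 function `MultiByteToWideChar` using `CP_UTF8`, and the result is what a `...W` function such as `CreateFileW` expects.

```python
def utf8_to_wide(name_utf8):
    """Convert UTF-8 bytes into a NUL-terminated UTF-16-LE buffer.

    UTF-16-LE with a two-byte terminator is the layout of a Windows
    wchar_t string, i.e. what the W variants of the Windows API take.
    """
    text = name_utf8.decode('utf-8')               # bytes -> Unicode text
    return text.encode('utf-16-le') + b'\x00\x00'  # text -> wide string + terminator

name = u'c:\\tmp\\中文-español'.encode('utf-8')  # what the caller would pass as `name`
wide = utf8_to_wide(name)
print(len(name), len(wide))
```

Whether this port performs such a conversion is exactly the open question here.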
So, does your port of leveldb work with UTF-8 or with the default encoding?
I am not quite sure. I am busy with a conference deadline now. You can check it yourself.
Can you give me a precompiled *.pyd for x86 Python?
I don't have x86 python.
I have updated the Win32 configuration of the project. You can compile it yourself.
In that case I can install x64 Python for testing, because installing Visual Studio and compiling the lib from source is more difficult. So, could you give me your pyd for x64, please?
You can download the x64 leveldb.pyd at http://pan.baidu.com/s/1pJ1mMnx .
Test code:
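#!/usr/bin/python
# -*- coding: utf-8 -*-
# Python 2.7 test script
import codecs
import leveldb

db_path_uni = u'c:\\tmp\\中文-español'

# Save the intended path to a file so it can be compared with the directory actually created
with codecs.open('leveldb_uni_test.txt', 'w', encoding='utf-8') as f:
    f.write(db_path_uni)

# Open (or create) the db at the Unicode path, then write and read one key
db = leveldb.LevelDB(db_path_uni)
db.Put('hello', 'hello world')
print db.Get('hello')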
It failed with this message:
Traceback (most recent call last):
  File "C:\Python27\leveldb_uni_test.py", line 12, in <module>
    db = leveldb.LevelDB(db_path_uni)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 7-8: ordinal not in range(128)
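The error is consistent with the unicode object being implicitly encoded with Python 2's default ASCII codec before it reaches leveldb. That is an assumption about the binding, not something checked in its source. The same failure can be reproduced without leveldb at all, as in this small Python 3 sketch (`first_ascii_failure` is a name made up for the example):

```python
def first_ascii_failure(text):
    """Return (start, end) of the first run of non-ASCII characters, or None."""
    try:
        text.encode('ascii')
    except UnicodeEncodeError as exc:
        return exc.start, exc.end  # `end` is exclusive
    return None

# Indexes 7 and 8 are the two Chinese characters right after 'c:\tmp\'
span = first_ascii_failure(u'c:\\tmp\\中文-español')
print(span)
```

The reported span covers the same characters as "position 7-8" in the traceback.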
If I convert the unicode string to UTF-8 and try to open the db:
db = leveldb.LevelDB(db_path_uni.encode('utf-8'))
then it works, BUT it creates a new directory c:\tmp\дёж–‡-espaГ±ol
which is not the Unicode path: it is the path's UTF-8 bytes interpreted as text in my Windows default encoding, win1251, and so it shows up as garbage.
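The garbage name can be reproduced in plain Python 3 by taking the UTF-8 bytes of the path and decoding them as Windows-1251, which is presumably what happens when those bytes reach an ANSI API such as CreateFileA. The helper name `misread_as_ansi` is made up for this sketch.

```python
def misread_as_ansi(text, ansi_codec='cp1251'):
    """Encode `text` as UTF-8, then decode those bytes as an ANSI code page."""
    return text.encode('utf-8').decode(ansi_codec)

garbled = misread_as_ansi(u'中文-español')
# The result contains an invisible soft hyphen (U+00AD) after the second
# character; strip it so the printed name is easier to compare by eye.
visible = garbled.replace(u'\u00ad', u'')
print(visible)
```

Apart from the invisible soft hyphen, this gives the same characters as the directory name that was created.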
In summary, this port does not work with Unicode paths :(
What do you think, and can you fix it?
I will try to fix this problem after my paper deadline on 11/6.