
Exceptions with books containing UTF8 characters

Open SR-G opened this issue 12 years ago • 5 comments

I'm running Pathagar on an up-to-date Ubuntu server. I have just checked out the Pathagar source code and set it up. Everything works fine, except when I have books containing accented characters in UTF-8 ("é", "à", and so on). Images for those books are not displayed, and the books cannot be downloaded: "'ascii' codec can't encode character u'\xf9' in position 80: ordinal not in range(128)"

Request Method: GET
Request URL:    http://192.168.1.4:8000/book/3062/download
Django Version: 1.3.1
Exception Type: UnicodeEncodeError
Exception Value:    
'ascii' codec can't encode character u'\xf9' in position 80: ordinal not in range(128)
Exception Location: /usr/lib/python2.7/genericpath.py in exists, line 18
Python Executable:  /usr/bin/python
Python Version: 2.7.3
Python Path:    
['/home/applications/pathagar',
 '/usr/lib/python2.7',
 '/usr/lib/python2.7/plat-linux2',
 '/usr/lib/python2.7/lib-tk',
 '/usr/lib/python2.7/lib-old',
 '/usr/lib/python2.7/lib-dynload',
 '/usr/local/lib/python2.7/dist-packages',
 '/usr/lib/python2.7/dist-packages',
 '/usr/lib/python2.7/dist-packages/gtk-2.0',
 '/usr/lib/pymodules/python2.7']

I really don't know anything about Python, so I'm not able to correct this behavior on my side. I don't even know whether the problem is system-side, Python-side, Django-side, or Pathagar-side. Any ideas?

SR-G avatar Apr 15 '13 21:04 SR-G

OK, after a lot of tries, the only working trick for now seems to be modifying "views.py" so that the download_book method contains:

book.save()
filename = filename.encode('utf-8')
return sendfile(request, filename, attachment=True)
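The workaround above amounts to turning the text path into UTF-8 bytes before it reaches any filesystem call. A minimal sketch of that idea follows; `to_fs_bytes` and `demo` are hypothetical names used only for illustration and are not part of Pathagar or Django, and the sketch assumes the files on disk are named in UTF-8, as reported in this thread.

```python
# -*- coding: utf-8 -*-
import os
import shutil
import tempfile


def to_fs_bytes(path, encoding="utf-8"):
    """Return `path` as a byte string (hypothetical helper).

    Text is encoded explicitly, so no implicit ASCII conversion can
    happen later. Byte strings are returned unchanged.
    """
    if isinstance(path, bytes):
        return path
    return path.encode(encoding)


def demo():
    """Create a file with an accented name and check that it is found."""
    tmp_dir = tempfile.mkdtemp()
    try:
        name = u"Hom\xe8re.epub"  # u"Homère.epub"
        full = os.path.join(to_fs_bytes(tmp_dir), to_fs_bytes(name))
        with open(full, "wb") as handle:
            handle.write(b"placeholder")
        # The lookup uses the byte path, as in the views.py workaround.
        return os.path.exists(full)
    finally:
        shutil.rmtree(tmp_dir)
```

Encoding explicitly keeps the choice of codec in the application instead of leaving it to the interpreter's defaults. The cost is that the codec is hard-coded, so this would misbehave on a filesystem that does not store names as UTF-8.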

SR-G avatar Apr 15 '13 22:04 SR-G

You should also be able to do:

filename = unicode(os.path.join(settings.MEDIA_ROOT, book.book_file.name))

But I will have to test that to make sure it works. If you added the book via a mass-import, could you send me the collection you imported? Or could you give me the filename of the book with the Unicode character so I can test?

sethwoodworth avatar Apr 15 '13 23:04 sethwoodworth

Hello ! Thanks for your answer. About the sendfile encoding problem : here is an example. French string = "Homère-L'iliade et l'odyssée (illustrés).epub"

Debug string on error (I reverted my change to get it back) = u'/home/applications/pathagar/../pathagar/static_media/books/Hom\xe8re-Liliade_et_lodyss\xe9e_illustr\xe9s.epub' (=> 'ascii' codec can't encode character u'\xe8' in position 62: ordinal not in range(128))
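The position in the error message is the index of the first character in the path that the ASCII codec rejects. A small sketch that reproduces this for the debug string above; `first_non_ascii` is a hypothetical helper name, and the path is copied from the debug output.

```python
# -*- coding: utf-8 -*-

# Path copied from the debug string reported in this thread.
path = (u"/home/applications/pathagar/../pathagar/static_media/books/"
        u"Hom\xe8re-Liliade_et_lodyss\xe9e_illustr\xe9s.epub")


def first_non_ascii(text):
    """Return (index, character) for the first character that cannot be
    encoded as ASCII, or None if the whole string is plain ASCII."""
    try:
        text.encode("ascii")
    except UnicodeEncodeError as exc:
        # exc.start is the position quoted in the error message.
        return exc.start, text[exc.start]
    return None


# For this path the failing index is 62, the "è" in "Homère".
position, char = first_non_ascii(path)

# Encoding with a codec that covers the character does not raise.
utf8_path = path.encode("utf-8")
```

This only shows where the implicit ASCII conversion fails. It does not by itself say which layer (system, Python, Django, or Pathagar) triggers that conversion.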

The filename on disk is of course OK (my Linux server is fully UTF-8), including the copy made in the Pathagar static_media folder:

/home/applications/pathagar/static_media/books% ls -1 Hom*
Homère-Liliade_et_lodyssée_illustrés.epub
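A UTF-8 shell does not guarantee that the Python process serving the site sees the same locale. One possible explanation, which nothing in this thread confirms, is that the server process runs with an ASCII filesystem encoding. The sketch below reports the encodings Python has picked up; `describe_encodings` is a hypothetical name, and it would need to be called from inside the web process (for example from a temporary debug view) to be meaningful.

```python
import locale
import sys


def describe_encodings():
    """Report the encodings the current Python process is using."""
    return {
        # Used when text paths are converted for filesystem calls.
        "filesystem": sys.getfilesystemencoding(),
        # Derived from the locale environment (LANG, LC_ALL, ...).
        "preferred": locale.getpreferredencoding(False),
        # Used for implicit str/unicode conversions on Python 2.
        "default": sys.getdefaultencoding(),
    }


if __name__ == "__main__":
    for key, value in sorted(describe_encodings().items()):
        print("%s: %s" % (key, value))
```

If the "filesystem" value turns out to be an ASCII variant in the server process while the shell reports UTF-8, the difference would point at the environment the server is started with and not at the files themselves.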

By the way, even though I'm less worried about it, covers show exactly the same problem: some covers are broken because of accented characters in the path of the picture. (I often have a .jpg in the subfolder containing the .epub, and I add this .jpg automatically with the mass-import facility.) Not knowing anything about Django, I really don't see where or how to handle that issue (I don't see how to easily alter the behavior in urls.py, or maybe in opds.py).

SR-G avatar Apr 16 '13 16:04 SR-G

Thanks, I will test this tonight.

About the cover images, I'm not the original author, so I am not always aware of known issues. Please open up a new issue with your description of the problem and any additional information and I will take a look at that as well.

sethwoodworth avatar Apr 16 '13 17:04 sethwoodworth

I have tested this issue, and it works without problems. Can we close this issue?

godiard avatar Apr 10 '15 20:04 godiard