Support encoding option for ftpfs

Open frafra opened this issue 3 years ago • 5 comments

I am fetching data from a Windows FTP server whose directory names contain non-ASCII characters. Opening it fails with:

Traceback (most recent call last):
  File "/home/frafra/.cache/pypoetry/virtualenvs/pyfilesystem-sync-qQEmY_5I-py3.8/lib/python3.8/site-packages/fs/errors.py", line 125, in new_func
    return func(*args, **kwargs)
  File "/home/frafra/.cache/pypoetry/virtualenvs/pyfilesystem-sync-qQEmY_5I-py3.8/lib/python3.8/site-packages/fs/opener/ftpfs.py", line 56, in open_fs
    return ftp_fs.opendir(dir_path, factory=ClosingSubFS)
  File "/home/frafra/.cache/pypoetry/virtualenvs/pyfilesystem-sync-qQEmY_5I-py3.8/lib/python3.8/site-packages/fs/base.py", line 1247, in opendir
    if not self.getinfo(path).is_dir:
  File "/home/frafra/.cache/pypoetry/virtualenvs/pyfilesystem-sync-qQEmY_5I-py3.8/lib/python3.8/site-packages/fs/ftpfs.py", line 682, in getinfo
    directory = self._read_dir(dir_name)
  File "/home/frafra/.cache/pypoetry/virtualenvs/pyfilesystem-sync-qQEmY_5I-py3.8/lib/python3.8/site-packages/fs/ftpfs.py", line 559, in _read_dir
    self.ftp.retrlines(
  File "/usr/lib64/python3.8/ftplib.py", line 461, in retrlines
    line = fp.readline(self.maxline + 1)
  File "/usr/lib64/python3.8/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe4 in position 49: invalid continuation byte

ipdb session:

ipdb> data
b'10-01-2021  11:00PM       <DIR>          Bilder V\xe4stra G\xf6taland\r\n10-06-2021  10:03AM       <DIR>          SeNorge\r\n'
ipdb> data.decode('utf8')
*** UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe4 in position 49: invalid continuation byte
ipdb> data.decode('windows-1252')
'10-01-2021  11:00PM       <DIR>          Bilder Västra Götaland\r\n10-06-2021  10:03AM       <DIR>          SeNorge\r\n'
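
The decoding difference can be reproduced without a server. A minimal sketch, using only the first listing line from the ipdb session above:

```python
# One listing line as returned by the server (bytes copied from the ipdb session).
data = b"10-01-2021  11:00PM       <DIR>          Bilder V\xe4stra G\xf6taland\r\n"

try:
    data.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    # 0xe4 starts a multi-byte UTF-8 sequence, but the next byte is plain ASCII.
    utf8_ok = False

# The same bytes decode cleanly as windows-1252.
text = data.decode("windows-1252")
print(utf8_ok, text.strip())
```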

Python built-in ftplib can use a different encoding: https://docs.python.org/3/library/ftplib.html#ftplib.FTP

class ftplib.FTP(host='', user='', passwd='', acct='', timeout=None, source_address=None, *, encoding='utf-8')
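
For illustration, a minimal sketch of passing the encoding directly to ftplib. The `encoding` keyword requires Python 3.9 or newer, so it would not be available on the Python 3.8 shown in the traceback. The host name and credentials in the commented lines are placeholders.

```python
import ftplib

# No host is given, so nothing connects yet; the object only stores its settings.
ftp = ftplib.FTP(encoding="windows-1252")
print(ftp.encoding)

# With a real server, the listing would then be decoded with that codec:
# ftp.connect("ftpserver")        # placeholder host
# ftp.login("user", "password")   # placeholder credentials
# ftp.retrlines("LIST")
```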

ftpfs does not take "encoding" as a parameter:

https://github.com/PyFilesystem/pyfilesystem2/blob/baa05606487d7aad2b7be5dd42a33276d463e4d1/fs/opener/ftpfs.py#L44-L52
https://github.com/PyFilesystem/pyfilesystem2/blob/baa05606487d7aad2b7be5dd42a33276d463e4d1/fs/ftpfs.py#L399-L409

I propose accepting encoding as an optional parameter, which should then be passed to the FTP constructor.

It would then be possible to connect to resources like: ftp://user:password@ftpserver/path?encoding=windows-1252
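
A rough sketch of how such a query parameter could be read and validated. The helper `encoding_from_url` is hypothetical and not part of pyfilesystem2; it only uses the standard library.

```python
import codecs
from urllib.parse import parse_qs, urlsplit


def encoding_from_url(url, default="utf-8"):
    """Return the codec named by ?encoding=... in the URL, or the default."""
    query = parse_qs(urlsplit(url).query)
    name = query.get("encoding", [default])[0]
    # codecs.lookup raises LookupError for unknown codec names,
    # so a typo in the URL fails early instead of at listing time.
    return codecs.lookup(name).name


print(encoding_from_url("ftp://user:password@ftpserver/path?encoding=windows-1252"))
```

Note that `codecs.lookup` returns Python's canonical codec name, so `windows-1252` comes back as `cp1252`.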

frafra avatar Oct 06 '21 11:10 frafra

There is some reference to encodings, but it seems that only utf-8 or latin-1 are handled: https://github.com/PyFilesystem/pyfilesystem2/blob/baa05606487d7aad2b7be5dd42a33276d463e4d1/fs/ftpfs.py#L492-L501

ftpfs should probably not override an encoding provided by the user.

If I set this variable to windows-1252 the software works: https://github.com/PyFilesystem/pyfilesystem2/blob/baa05606487d7aad2b7be5dd42a33276d463e4d1/fs/ftpfs.py#L501
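
The behaviour I have in mind could look roughly like this. It is a sketch based on my reading of the linked lines, not a copy of the code in fs/ftpfs.py; `choose_encoding` and `user_encoding` are hypothetical names.

```python
def choose_encoding(features, user_encoding=None):
    """Pick the codec for an FTP connection.

    features: collection of feature names advertised by the server.
    user_encoding: codec explicitly requested by the caller, if any.
    """
    # An explicit choice by the user always wins.
    if user_encoding is not None:
        return user_encoding
    # Otherwise fall back to the utf-8 / latin-1 detection.
    return "utf-8" if "UTF8" in features else "latin-1"


print(choose_encoding({"UTF8"}, user_encoding="windows-1252"))
```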

frafra avatar Oct 06 '21 11:10 frafra

Same problem here, although I'm working with a Linux FTP server. My root contains folders with special characters, so I can't even do a listdir() or anything else with them until this gets fixed. Thanks :).

Timtam avatar Apr 26 '22 07:04 Timtam

@Timtam do you know which encoding your FTP server uses?

frafra avatar Apr 26 '22 07:04 frafra

Nope, unfortunately not. I checked the interface of my Synology NAS, which says "UTF-8 automatic" (whatever that means). I also debugged a FileZilla connection and found that FileZilla sends "OPTS UTF8 ON" on first connection, but I couldn't find out what the default encoding might be.

Timtam avatar Apr 26 '22 08:04 Timtam

An alternative solution could be to send OPTS UTF8 ON to the server if pyfilesystem2 is not able to cope with different encodings.
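
A sketch of that alternative using ftplib directly. The helper `try_enable_utf8` is hypothetical; it assumes an already connected `ftplib.FTP` object and treats any FTP error reply as "not supported".

```python
import ftplib


def try_enable_utf8(ftp):
    """Ask the server to switch to UTF-8; return True if it agreed."""
    try:
        ftp.sendcmd("OPTS UTF8 ON")
    except ftplib.Error:
        # The server rejected the command, so keep the current encoding.
        return False
    # Data connections opened after this point are decoded as UTF-8.
    ftp.encoding = "utf-8"
    return True
```

Whether a given server honours the command would still have to be checked case by case.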

frafra avatar Apr 26 '22 10:04 frafra