Support for reading and writing files directly to/from ftp

Open ziky90 opened this issue 9 years ago • 7 comments

Another feature that I would like to have in smart_open is support for direct reading/writing of the data to/from ftp.

Anyone else interested in this? Does this make sense?

ziky90 avatar Sep 16 '15 15:09 ziky90

I thought about making an scp interface in the worst possible way (by calling out to a shell command).

val314159 avatar Sep 16 '15 15:09 val314159

I'd be interested in seeing the implementation for this. I've used pysftp in the past for such tasks and it is very slow for large files (you're better off with sftp).

trentgerman avatar Aug 15 '18 18:08 trentgerman

Is this issue still relevant today? I'd like to work on that if that would be the case :)

saschalang32 avatar Jan 27 '21 11:01 saschalang32

+1. I currently use ftplib's retrbinary() to download from FTP into a BytesIO object and then use smart_open to write to S3. This is the fastest approach I have found so far. Ideally, I want the data read from FTP to be written into S3 at the same time, as a stream.

It would be great if smart_open could read from FTP in the same way as in the following example (apologies, the code indentation is not working correctly in this comment box):

# Proposed usage: open here is smart_open's open, not the builtin.
from smart_open import open

with open('ftp://user@host/smart_open/tests/test_data/1984.csv') as fin:
    with open('s3://bucket-name/smart_open/tests/test_data/1984.csv', 'w') as fout:
        for line in fin:
            fout.write(line)
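
For readers following along, the workaround described in this comment might look roughly like the sketch below. It uses only the standard library. The function names are hypothetical, and the destination is any writable binary file object (for example, a handle opened by smart_open in 'wb' mode, which is an assumption and not shown running here). The second function illustrates one possible way to avoid the intermediate buffer, by passing the destination's write method directly as the retrbinary() callback.

```python
import io


def fetch_ftp_to_buffer(ftp, remote_path):
    """Download remote_path over an open ftplib.FTP connection into memory."""
    buf = io.BytesIO()
    # retrbinary() calls the callback once for each block of bytes received.
    ftp.retrbinary(f"RETR {remote_path}", buf.write)
    buf.seek(0)  # rewind so the caller can read from the start
    return buf


def ftp_to_destination(ftp, remote_path, fout):
    """Buffered variant: hold the whole file in memory, then write it out."""
    buf = fetch_ftp_to_buffer(ftp, remote_path)
    fout.write(buf.getvalue())


def stream_ftp_to_destination(ftp, remote_path, fout):
    """Streaming variant: each received block goes straight to fout."""
    ftp.retrbinary(f"RETR {remote_path}", fout.write)


# Hypothetical usage (host, credentials, and bucket are placeholders):
# from ftplib import FTP
# from smart_open import open as sopen
# with FTP("host") as ftp:
#     ftp.login("user", "password")
#     with sopen("s3://bucket-name/smart_open/tests/test_data/1984.csv", "wb") as fout:
#         stream_ftp_to_destination(ftp, "smart_open/tests/test_data/1984.csv", fout)
```

The streaming variant never holds more than one block in memory, at the cost of the FTP transfer being paced by how quickly the destination accepts writes.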

t4nujms avatar Jul 18 '21 19:07 t4nujms

Sort of related, but https://docs.python.org/3/library/shutil.html#shutil.copyfileobj is fantastic (and considerably faster) for copying between two file like objects like you're describing.
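
A minimal sketch of the copyfileobj approach, using in-memory BytesIO objects as stand-ins for the two file-like objects. The helper name copy_stream and the 1 MiB chunk size are illustrative assumptions. In practice fin and fout would be whatever binary handles are being copied between.

```python
import io
import shutil


def copy_stream(fin, fout, chunk_size=1024 * 1024):
    """Copy everything from fin to fout in chunks of at most chunk_size bytes."""
    # copyfileobj reads and writes one chunk at a time, so memory use stays
    # bounded by chunk_size regardless of how large the source is.
    shutil.copyfileobj(fin, fout, chunk_size)


# Stand-ins for real file-like objects opened in binary mode.
source = io.BytesIO(b"line 1\nline 2\n")
target = io.BytesIO()
copy_stream(source, target)
```

Copying in large chunks avoids the per-line overhead of a `for line in fin` loop, which is where the speedup over line-by-line writes comes from.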

danielloader avatar Jul 31 '21 14:07 danielloader

Hi everybody, is this issue still open? I know that this thread is quite old, but I still don't see this feature and I think it would be useful to add. I would love to work on it if possible.

RachitSharma2001 avatar Sep 11 '22 20:09 RachitSharma2001

As an update, I currently have a PR here that allows for opening ftp files using smart_open.

RachitSharma2001 avatar Sep 16 '22 19:09 RachitSharma2001