smart_open
Support for reading and writing files directly to/from ftp
Another feature that I would like to have in smart_open is support for reading and writing data directly from/to FTP.
Anyone else interested in this? Does this make sense?
I thought about implementing an scp interface in the worst possible way (by shelling out to a command).
I'd be interested in seeing the implementation for this. I've used pysftp in the past for such tasks and it is very slow for large files (you're better off with sftp).
Is this issue still relevant today? I'd like to work on it if so :)
+1. ftplib has a retrbinary() function, which I currently use to download from FTP into a BytesIO object; I then use smart_open to write that to S3. This is the fastest approach I've found so far. Ideally, I'd like the stream read from FTP to be written to S3 at the same time.
It would be great if smart_open could read from FTP in the same way as in the following example:
with open('ftp://user@host/smart_open/tests/test_data/1984.csv') as fin:
    with open('s3://bucket-name/smart_open/tests/test_data/1984.csv', 'w') as fout:
        for line in fin:
            fout.write(line)
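For comparison, the buffered workaround described in this comment might look roughly like the sketch below. The helper name `download_to_buffer` is made up for illustration, and `ftp` is assumed to be an already connected and logged-in `ftplib.FTP` instance. Note that the whole file ends up in memory before anything is written out, which is exactly the limitation being discussed.

```python
import io


def download_to_buffer(ftp, remote_path):
    """Download remote_path over FTP into an in-memory buffer.

    `ftp` is assumed to be a connected, logged-in ftplib.FTP object
    (hypothetical helper, not part of smart_open).
    """
    buf = io.BytesIO()
    # retrbinary calls buf.write once for every block it receives.
    ftp.retrbinary(f"RETR {remote_path}", buf.write)
    # Rewind so the caller can read the data from the beginning.
    buf.seek(0)
    return buf
```

The returned buffer can then be passed to whatever writes to S3, but nothing is uploaded until the FTP download has finished.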
Sort of related, but shutil.copyfileobj (https://docs.python.org/3/library/shutil.html#shutil.copyfileobj) is fantastic (and considerably faster) for copying between two file-like objects, as you're describing.
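As a minimal sketch of that suggestion, here is shutil.copyfileobj copying between two file-like objects in fixed-size chunks. The in-memory buffers stand in for the real FTP source and S3 destination, and both the `stream_copy` name and the 1 MiB chunk size are assumptions chosen for illustration.

```python
import io
import shutil


def stream_copy(src, dst, chunk_size=1024 * 1024):
    """Copy src to dst chunk by chunk, without reading it all at once."""
    # copyfileobj reads at most chunk_size bytes per iteration.
    shutil.copyfileobj(src, dst, chunk_size)


# Stand-ins for real file-like objects (e.g. a remote reader and writer).
src = io.BytesIO(b"year,title\n1984,Nineteen Eighty-Four\n")
dst = io.BytesIO()
stream_copy(src, dst)
```

Any pair of objects exposing read() and write() should work here, so the same call applies once both ends can be opened as streams.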
Hi everybody, is this issue still open? I know that this thread is quite old, but I still don't see this feature and I think it would be useful to add. I would love to work on it if possible.
As an update, I currently have a PR here that allows for opening ftp files using smart_open.