airflow
Add FTPOperator and FTPSOperator
Description
An SFTPOperator already exists (documentation here), which provides an easy way to read data from an SFTP server to local disk or write data to an SFTP server. However, no such operator exists for FTP and FTPS servers; there are only FTP hooks and sensors (compare the airflow/airflow/providers/ftp directory with the airflow/airflow/providers/sftp directory).
Use case/motivation
I am hoping to provide the following two operators for Airflow Developers:
FTPOperator(
    task_id="operation",
    ftp_conn_id="ftp_default",
    local_filepath="route_to_local_file",
    remote_filepath="remote_route_to_copy",
    operation="put",
    dag=dag
)
FTPSOperator(
    task_id="operation",
    ftps_conn_id="ftps_default",
    local_filepath="route_to_local_file",
    remote_filepath="remote_route_to_copy",
    operation="put",
    dag=dag
)
The FTPOperator would connect to an FTP server without encryption and copy a file from that server to local disk (if the operation is "get") or copy a file from local disk to the server (if the operation is "put"). The FTPSOperator would do the same for an FTP server that uses TLS encryption.
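To make the intended "get"/"put" behaviour concrete, here is a minimal sketch of the transfer logic such an operator might wrap. It uses only Python's standard ftplib rather than Airflow's FTP hooks, and the helper names transfer_file and connect are hypothetical, not part of any existing Airflow API.

```python
import ftplib
from typing import Literal


def connect(host: str, user: str, password: str, use_tls: bool = False) -> ftplib.FTP:
    """Open a plain FTP connection, or an FTPS (explicit TLS) one if use_tls is set.

    Hypothetical helper for illustration; a real operator would likely
    obtain its connection from an Airflow hook instead.
    """
    if use_tls:
        ftps = ftplib.FTP_TLS(host)
        ftps.login(user, password)  # login() secures the control channel first
        ftps.prot_p()               # also encrypt the data channel
        return ftps
    ftp = ftplib.FTP(host)
    ftp.login(user, password)
    return ftp


def transfer_file(
    ftp: ftplib.FTP,
    local_filepath: str,
    remote_filepath: str,
    operation: Literal["get", "put"],
) -> None:
    """Copy one file between local disk and an already-connected server."""
    if operation == "get":
        # Download: stream the remote file into a local file in binary mode.
        with open(local_filepath, "wb") as local_file:
            ftp.retrbinary(f"RETR {remote_filepath}", local_file.write)
    elif operation == "put":
        # Upload: stream the local file to the remote path in binary mode.
        with open(local_filepath, "rb") as local_file:
            ftp.storbinary(f"STOR {remote_filepath}", local_file)
    else:
        raise ValueError(f"Unsupported operation: {operation!r}")
```

Under these assumptions, the only difference between the two proposed operators is how the connection is opened; the transfer step itself is shared.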
Related issues
No response
Are you willing to submit a PR?
- [X] Yes I am willing to submit a PR!
Code of Conduct
- [X] I agree to follow this project's Code of Conduct
Thanks for opening your first issue here! Be sure to follow the issue template!
Sounds great! Feel free to submit a PR.
FYI, at a previous company I implemented such operators, and I found it important to check, at a bare minimum, that the size of the transferred file matched the size the remote server reported: https://docs.python.org/3/library/ftplib.html#ftplib.FTP.size. In my experience, FTP was more susceptible to things going wrong mid-transfer.
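A rough sketch of such a size check, assuming an already-connected ftplib.FTP object. The function name verify_transfer_size is hypothetical, and switching to binary mode before calling size() is a precaution for servers that refuse SIZE in ASCII mode rather than something every server requires.

```python
import ftplib
import os


def verify_transfer_size(ftp: ftplib.FTP, local_filepath: str, remote_filepath: str) -> int:
    """Raise if the local file size differs from the size the server reports.

    Returns the verified size in bytes on success.
    """
    # Some servers reject SIZE in ASCII mode, so switch to binary first.
    ftp.voidcmd("TYPE I")
    remote_size = ftp.size(remote_filepath)  # None if the reply is not a 213
    local_size = os.path.getsize(local_filepath)

    if remote_size is None:
        raise RuntimeError(f"Server did not report a size for {remote_filepath!r}")
    if remote_size != local_size:
        raise RuntimeError(
            f"Size mismatch: local {local_size} bytes, remote {remote_size} bytes"
        )
    return local_size
```

Calling this right after a "get" or "put" would catch truncated transfers, though a matching size does not guarantee matching content.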
I also implemented other integrity checks, such as checking whether the server supported HASH algorithms, but I found that supporting such features required a fair amount of battle testing against many different types of FTP servers, because you could get very unexpected results (such as servers claiming support but only ever returning 0, or returning the result with an additional prefix or suffix string).
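One way to guard against those quirks is to validate whatever the server returns before trusting it. The sketch below is an illustration only: the function name extract_digest, the algorithm table, and the example reply strings are assumptions, since reply formats vary between servers.

```python
import re
from typing import Optional

# Expected hex-digest lengths for a few common algorithms (illustrative subset).
_HEX_LENGTHS = {"MD5": 32, "SHA-1": 40, "SHA-256": 64}


def extract_digest(reply: str, algorithm: str) -> Optional[str]:
    """Pull a plausible hex digest out of a server reply, or return None.

    Tolerates extra text before or after the digest, and rejects replies
    that contain no digest of the right length or only zeros.
    """
    length = _HEX_LENGTHS[algorithm]
    # A run of exactly `length` hex characters, not part of a longer hex run.
    pattern = rf"(?<![0-9A-Fa-f])[0-9A-Fa-f]{{{length}}}(?![0-9A-Fa-f])"
    match = re.search(pattern, reply)
    if match is None:
        return None
    digest = match.group(0).lower()
    if set(digest) == {"0"}:
        # An all-zero digest is treated as "server did not really compute it".
        return None
    return digest
```

The result can then be compared against a digest computed locally with hashlib; when the function returns None, falling back to the size check alone is probably the safer choice.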