airflow
Create a Cloud Storage Operator that could return a list of objects in a folder
Description
Create a new operator inside airflow.providers.google.cloud.operators.gcs
that, given a pattern/prefix, returns a list
of the files in that folder, similar to the client methods:
bucket = storage_client.get_bucket(BUCKET_NAME)
blobs = bucket.list_blobs(prefix=filename)
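For reference, here is a minimal sketch of the behaviour being requested, written against the same two client calls. The function name `list_object_names` and the `suffix` filter are my own illustration, not part of any library, and the bucket and prefix in the usage comment are placeholders.

```python
from typing import Any, List


def list_object_names(client: Any, bucket_name: str, prefix: str, suffix: str = "") -> List[str]:
    """Return the names of objects under `prefix`, optionally filtered by `suffix`.

    `client` is expected to behave like google.cloud.storage.Client,
    i.e. expose get_bucket(), whose result exposes list_blobs(prefix=...).
    """
    bucket = client.get_bucket(bucket_name)
    return [
        blob.name
        for blob in bucket.list_blobs(prefix=prefix)
        if blob.name.endswith(suffix)  # an empty suffix keeps every object
    ]


# Usage with the real client (needs google-cloud-storage and credentials):
#   from google.cloud import storage
#   names = list_object_names(storage.Client(), "my-bucket", "incoming/", ".csv")
```

Note that the prefix match is done by the storage service, while the suffix check here happens client-side after listing.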
Use case/motivation
No response
Related issues
I have a process that runs once a day, reads some *.csv files from storage, and processes them. It would be nice to have an operator that does exactly that, without needing to write custom code for it. Once I write the custom code, probably using the storage hook, I can try to paste it here, but I don't have time to create a PR.
Are you willing to submit a PR?
- [ ] Yes I am willing to submit a PR!
Code of Conduct
- [X] I agree to follow this project's Code of Conduct
OK, after a bit of googling I found that it already exists: GCSListObjectsOperator. So what I believe is wrong is the documentation, because I didn't find it here: https://airflow.apache.org/docs/apache-airflow-providers-google/stable/operators/cloud/gcs.html Should I create a new issue for fixing the documentation, or can it continue here?
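For anyone landing here with the same use case, a rough sketch of how GCSListObjectsOperator might be wired into a daily DAG follows. The bucket name, prefix, DAG id and task id are placeholders I made up, and the `schedule` argument assumes a reasonably recent Airflow (older 2.x releases use `schedule_interval`), so check the operator reference for your provider version before relying on it.

```python
from datetime import datetime

# Placeholder values for illustration only; replace with your own bucket and prefix.
LIST_TASK_KWARGS = {
    "task_id": "list_incoming_objects",
    "bucket": "my-bucket",
    "prefix": "incoming/",
}


def build_dag():
    """Build a daily DAG with a single task that lists objects under a prefix."""
    # Imported here so this snippet can be read and loaded without Airflow installed.
    from airflow import DAG
    from airflow.providers.google.cloud.operators.gcs import GCSListObjectsOperator

    with DAG(
        dag_id="list_gcs_objects_example",
        start_date=datetime(2024, 1, 1),
        schedule="@daily",  # assumption: Airflow 2.4 or later
        catchup=False,
    ) as dag:
        # The operator returns the matching object names, which Airflow
        # pushes to XCom for downstream tasks to consume.
        GCSListObjectsOperator(**LIST_TASK_KWARGS)
    return dag


# In a real DAG file, expose the DAG at module level so the scheduler finds it:
#   dag = build_dag()
```

Filtering down to *.csv would then happen in a downstream task, or through whatever filtering parameters your provider version of the operator supports.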
You can just edit this Issue to request that this Operator be added to the list of operators in the docs, I think.
I can take this