Create a Cloud Storage Operator that could return a list of objects in a folder

Open lopezvit opened this issue 10 months ago • 2 comments

Description

Create a new operator in airflow.providers.google.cloud.operators.gcs that, given a pattern/prefix, returns a list of the files in that folder, similar to the client method:

from google.cloud import storage
storage_client = storage.Client()
bucket = storage_client.get_bucket(BUCKET_NAME)
blobs = bucket.list_blobs(prefix=filename)

Use case/motivation

No response

Related issues

I have a process that runs once a day, reads some *.csv files from storage, and processes them. It would be nice to have an operator that does exactly that, without needing to write custom code for it. Once I have written the custom code, probably using the storage hook, I can try to paste it here, but I don't have time to create a PR.

Are you willing to submit a PR?

  • [ ] Yes I am willing to submit a PR!

Code of Conduct

lopezvit avatar Apr 27 '24 10:04 lopezvit

OK, after a bit of googling I found that it already exists: GCSListObjectsOperator. What I believe is wrong, then, is the documentation, because I didn't find the operator here: https://airflow.apache.org/docs/apache-airflow-providers-google/stable/operators/cloud/gcs.html Should I create a new issue for fixing the documentation, or can it continue here?

lopezvit avatar Apr 27 '24 11:04 lopezvit
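
For readers with the same daily *.csv use case, here is a minimal sketch of how GCSListObjectsOperator might be wired up. It is not taken from the thread: the task id, the helper names select_csv and build_list_task, and the bucket and prefix values are hypothetical, and it assumes the Google provider package is installed. The operator is assumed to return the matching object names, which the helper then narrows down to CSV files.

```python
from fnmatch import fnmatch


def select_csv(object_names, pattern="*.csv"):
    """Keep only the object names that match the glob pattern.

    Hypothetical helper: meant to post-process the list of names
    that the listing task returns.
    """
    return [name for name in object_names if fnmatch(name, pattern)]


def build_list_task(bucket_name, prefix):
    """Create a task that lists the objects under a prefix.

    Airflow is imported lazily here so that select_csv stays usable
    without Airflow installed. The task_id is an illustrative name.
    """
    from airflow.providers.google.cloud.operators.gcs import (
        GCSListObjectsOperator,
    )

    return GCSListObjectsOperator(
        task_id="list_daily_csv_files",
        bucket=bucket_name,
        prefix=prefix,
    )
```

Inside a DAG, something like build_list_task("my-bucket", "daily/") would produce the listing task, and a downstream Python task could pass the returned names through select_csv before processing them.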

You can just edit this Issue to request that this Operator be added to the list of operators in the docs, I think.

RNHTTR avatar Apr 27 '24 19:04 RNHTTR

I can take this

chaityacshah avatar May 31 '24 07:05 chaityacshah