Add Support for S3 Datastore Ingestion

Open sydneynotthecity opened this issue 8 months ago • 2 comments

The stellar-etl CLI can currently pull data only from GCS because there was originally only a GCS BufferedStorageBackend. Support for S3 was recently added, so we should extend that support to stellar-etl.

Additionally, Goldsky just enabled its open, public bucket in S3. Supporting S3 buckets gives operators in the ecosystem a free, public source from which to extract and transform data.

Update the CreateDatastore() method to be configurable. Developers should be able to pass flags via CLI to specify whether they are ingesting from GCS or S3. GCS can remain the default so that internal processes do not break.

Updates to the params for CreateDatastore() should reflect the finalized SEP-0054 standard. Currently, the DataStoreSchema is a fixed object. It should also be refactored to allow for different datastore configurations. Goldsky's configuration happens to align with SDF's internal data store; however, that may not always be the case.
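As a rough illustration of the configurable approach, here is a minimal sketch of how CLI flags could be turned into a datastore config with GCS as the default. Everything in it is an assumption made for illustration: the flag names (`--datastore-type`, `--datastore-path`, `--datastore-region`, `--ledgers-per-file`, `--files-per-partition`), the param keys, and the default schema values are hypothetical, and `DataStoreConfig` / `DataStoreSchema` are local stand-ins rather than the types from stellar/go. The real CreateDatastore() signature may look quite different.

```go
package main

import (
	"flag"
	"fmt"
	"strings"
)

// DataStoreSchema and DataStoreConfig are local stand-ins for illustration.
// They are not the types from stellar/go/support/datastore.
type DataStoreSchema struct {
	LedgersPerFile    uint32
	FilesPerPartition uint32
}

type DataStoreConfig struct {
	Type   string
	Params map[string]string
	Schema DataStoreSchema
}

// parseDatastoreFlags builds a config from CLI arguments.
// All flag names, param keys, and defaults here are hypothetical.
func parseDatastoreFlags(args []string) (DataStoreConfig, error) {
	fs := flag.NewFlagSet("stellar-etl", flag.ContinueOnError)
	dsType := fs.String("datastore-type", "GCS", "datastore backend: GCS or S3")
	path := fs.String("datastore-path", "", "bucket path holding ledger files")
	region := fs.String("datastore-region", "", "bucket region (S3 only)")
	ledgersPerFile := fs.Uint("ledgers-per-file", 1, "ledgers stored in each file")
	filesPerPartition := fs.Uint("files-per-partition", 64000, "files in each partition")
	if err := fs.Parse(args); err != nil {
		return DataStoreConfig{}, err
	}
	if *path == "" {
		return DataStoreConfig{}, fmt.Errorf("--datastore-path is required")
	}

	cfg := DataStoreConfig{
		Type:   strings.ToUpper(*dsType),
		Params: map[string]string{"destination_bucket_path": *path},
		Schema: DataStoreSchema{
			LedgersPerFile:    uint32(*ledgersPerFile),
			FilesPerPartition: uint32(*filesPerPartition),
		},
	}

	switch cfg.Type {
	case "GCS":
		// Default backend; no extra params in this sketch.
	case "S3":
		if *region == "" {
			return DataStoreConfig{}, fmt.Errorf("--datastore-region is required for S3")
		}
		cfg.Params["region"] = *region
	default:
		return DataStoreConfig{}, fmt.Errorf("unsupported datastore type %q", *dsType)
	}
	return cfg, nil
}

func main() {
	// Example invocation with placeholder bucket and region values.
	cfg, err := parseDatastoreFlags([]string{
		"--datastore-type", "S3",
		"--datastore-path", "example-bucket/ledgers",
		"--datastore-region", "us-east-1",
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", cfg)
}
```

Keeping the schema values as flags, rather than a fixed object, is what would let an operator point at a bucket whose layout differs from SDF's internal data store.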

sydneynotthecity, Jul 22 '25 14:07

Update the ticket to also include adding S3 export support. Make the ticket more "good first issue" friendly.

Random list of things that probably need updating for exporting to S3

Alternatively, we can update all the export code to use stellar/go/support/datastore, which defines a standard interface for getting and putting files.
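To make the interface idea concrete, here is a small sketch of export code written against a storage interface instead of a specific cloud client. The `FileStore` interface, `memStore`, `exportBatch`, and `readString` are all hypothetical names invented for this example; the interface is deliberately pared down (no context, no metadata) and is not the actual API of stellar/go/support/datastore.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// FileStore is a hypothetical, simplified get/put interface.
// A GCS-backed or S3-backed type would satisfy it equally well.
type FileStore interface {
	PutFile(path string, r io.Reader) error
	GetFile(path string) (io.ReadCloser, error)
}

// memStore is an in-memory FileStore used here in place of a real bucket.
// It is not safe for concurrent use.
type memStore struct {
	files map[string][]byte
}

func newMemStore() *memStore {
	return &memStore{files: make(map[string][]byte)}
}

func (m *memStore) PutFile(path string, r io.Reader) error {
	data, err := io.ReadAll(r)
	if err != nil {
		return err
	}
	m.files[path] = data
	return nil
}

func (m *memStore) GetFile(path string) (io.ReadCloser, error) {
	data, ok := m.files[path]
	if !ok {
		return nil, fmt.Errorf("file %q not found", path)
	}
	return io.NopCloser(bytes.NewReader(data)), nil
}

// exportBatch writes one exported payload through the interface,
// so the caller never needs to know which backend is in use.
func exportBatch(store FileStore, path string, payload []byte) error {
	return store.PutFile(path, bytes.NewReader(payload))
}

// readString reads a stored file back as a string.
func readString(store FileStore, path string) (string, error) {
	rc, err := store.GetFile(path)
	if err != nil {
		return "", err
	}
	defer rc.Close()
	data, err := io.ReadAll(rc)
	return string(data), err
}

func main() {
	store := newMemStore()
	// The path and payload are placeholder values.
	if err := exportBatch(store, "exports/example.txt", []byte("example row\n")); err != nil {
		fmt.Println("export failed:", err)
		return
	}
	got, err := readString(store, "exports/example.txt")
	fmt.Printf("%q %v\n", got, err)
}
```

With this shape, adding S3 export would mean supplying another implementation of the interface rather than touching each export command.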

chowbao, Sep 02 '25 16:09

Dependent on https://github.com/stellar/ops/issues/4183

amishas157, Oct 07 '25 17:10