fix(cli/rest): Support Glue REST operations with Iceberg-Go CLI
Motivation
Support the AWS Glue Iceberg REST endpoint from the Iceberg-Go CLI. Currently, a Forbidden error is thrown when Iceberg tables in Glue are queried using the CLI.
https://docs.aws.amazon.com/glue/latest/dg/connect-glu-iceberg-rest.html
Fix:
- Added RestCatalogConfig struct to CatalogConfig struct. RestCatalogConfig contains REST catalog configuration properties.
- The above values are used before creating the REST catalog
- Use AWS environment credentials when Credentials property is not set
Testing:
- Added unit test for new configuration
- Used the rest_config.yaml below to query Iceberg tables in AWS Glue.
catalog:
  default:
    type: rest
    uri: https://glue.us-east-1.amazonaws.com/iceberg
    region: us-east-1
    warehouse: YOUR_AWS_ACCOUNT_ID
    rest-config:
      sigv4-region: us-east-1
      sigv4-service: glue
I'm doubtful about this approach. Can we pass this via flags? This makes the CLI less unified. I think it's better to keep the approach consistent by using flags.
Thanks @laskoviymishka for reviewing!
Yes, that was my initial approach as well: keep CLI args and config files as close as possible. Since we may end up passing too many arguments, I used this approach instead.
Do you think passing these REST config arguments as JSON would be easier?
I think this is still a problem. From what I see in other Glue clients, parameters are typically passed as CLI flags instead of a JSON file. CLI flags are more self-describing (you can view them directly with --help) and much more convenient to use.
Currently, using a custom JSON schema adds an unnecessary layer of complexity - it's less clear to the user and makes maintenance more cumbersome.
Could we consider following the pattern used by other tools in the ecosystem?
See, for example, Iceberg Catalog Migrator:
java -jar iceberg-catalog-migrator-cli.jar migrate \
--source-catalog-type GLUE \
--source-catalog-properties warehouse=s3a://example-bucket/gluecatalog/,io-impl=org.apache.iceberg.aws.s3.S3FileIO \
--target-catalog-type NESSIE \
--target-catalog-properties uri=http://...,warehouse=s3a://...,io-impl=...
Another option is to simply pass these as AWS_* env vars, as is done in https://py.iceberg.apache.org/cli/
I agree with @laskoviymishka that we shouldn't be passing custom JSON stuff and should instead use flags or just follow whatever pattern pyiceberg is using.
Thank you @laskoviymishka @zeroshade for the suggestion. I have updated the code to follow the pattern in Iceberg Catalog Migrator.
Successfully tested the CLI using the command below:
./iceberg list --catalog rest --uri https://glue.us-east-1.amazonaws.com/iceberg --warehouse <ACCOUNT_ID> --rest-config sigv4-region=us-east-1,sigv4-service=glue
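For readers following the thread, the comma-separated key=value form passed to --rest-config above could be parsed roughly as in the sketch below. This is only an illustration and not necessarily how the PR implements it: the function name parseKeyValues is hypothetical, and the sketch assumes that neither keys nor values contain commas.

```go
package main

import (
	"fmt"
	"strings"
)

// parseKeyValues is a hypothetical helper (not taken from the PR) that turns
// "k1=v1,k2=v2" into a map. It assumes values never contain commas.
func parseKeyValues(s string) (map[string]string, error) {
	props := make(map[string]string)
	if strings.TrimSpace(s) == "" {
		// An empty flag value means no extra properties.
		return props, nil
	}
	for _, pair := range strings.Split(s, ",") {
		// Split on the first '=' only, so values may themselves contain '='.
		key, value, ok := strings.Cut(pair, "=")
		key = strings.TrimSpace(key)
		if !ok || key == "" {
			return nil, fmt.Errorf("invalid property %q: expected key=value", pair)
		}
		props[key] = strings.TrimSpace(value)
	}
	return props, nil
}

func main() {
	props, err := parseKeyValues("sigv4-region=us-east-1,sigv4-service=glue")
	if err != nil {
		panic(err)
	}
	fmt.Println(props["sigv4-region"], props["sigv4-service"])
}
```

Rejecting malformed pairs up front means a typo in the flag surfaces as a clear CLI error rather than as a signing failure against the Glue endpoint later on.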