improve pyiceberg CLI
Feature Request / Improvement
Based on issues described in #1771
- We'd want to make it clear that the `default` catalog is used by default when no `--catalog` parameter is given. For example, `pyiceberg list` uses the `default` entry in the `.pyiceberg.yaml` file.
- We should fix the order of the parameters passed into the CLI. For example, `pyiceberg list --catalog hive` does not override the `catalog` but `pyiceberg --catalog hive list` does.
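
The ordering problem in the second point can be illustrated with a small sketch. The actual CLI is built on Click, where an option declared on the command group is only recognized before the subcommand name. The sketch below uses the standard-library `argparse` instead, purely to show one way of accepting `--catalog` in either position. The program name `pyiceberg-sketch` and the helper names `build_parser` and `parse` are hypothetical, and the fallback to `default` mirrors the behavior described in the first point.

```python
import argparse
from typing import List, Tuple


def build_parser() -> argparse.ArgumentParser:
    # Define --catalog once and attach it to both the top-level parser and
    # the subcommand. SUPPRESS means "leave the attribute unset when the flag
    # is absent", so the subcommand never overwrites a value that was given
    # before the subcommand name.
    shared = argparse.ArgumentParser(add_help=False)
    shared.add_argument("--catalog", default=argparse.SUPPRESS)

    parser = argparse.ArgumentParser(prog="pyiceberg-sketch", parents=[shared])
    subcommands = parser.add_subparsers(dest="command", required=True)
    subcommands.add_parser("list", parents=[shared])
    return parser


def parse(argv: List[str]) -> Tuple[str, str]:
    namespace = build_parser().parse_args(argv)
    # Fall back to the "default" entry when --catalog was not given anywhere.
    catalog = getattr(namespace, "catalog", "default")
    return namespace.command, catalog


if __name__ == "__main__":
    print(parse(["--catalog", "hive", "list"]))
    print(parse(["list", "--catalog", "hive"]))
    print(parse(["list"]))
```

In this sketch both argument orders select the same catalog. A Click-based fix would look different, for example by declaring the option on each subcommand as well as on the group.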
Hi, I can work on this issue. Could you assign the issue to me?
sure @iting0321 happy to help review :)
Hi, I have some questions.
If the command is pyiceberg list, I need to read the default entry in the catalog. However, what if default is not set in the catalog?
Additionally, if the command is pyiceberg list --catalog hive, should I simply return a command order error, or should I read the default catalog and return the result as if the command were pyiceberg list at the same time?
Also, I would like to know whether you can provide an example of .pyiceberg.yaml that I can test locally. I am a bit confused about the content of .pyiceberg.yaml. For example, can we set the same uri prefix for both hive and default?
catalog:
  hive:
    uri: thrift://localhost:9083
    s3.endpoint: http://localhost:9100
    s3.access-key-id: admin
    s3.secret-access-key: adminadmin
    s3.region: us-east-1
  default:
    uri: thrift://default-catalog:9083
@iting0321 here's the current documentation for the CLI: https://py.iceberg.apache.org/cli/
In general, the CLI requires a connection to the catalog. This can be done by passing the catalog configs via parameters, such as pyiceberg --uri ... list or by reading from the config file (~/.pyiceberg.yaml).
By default, the CLI will read the default entry in the config file. To read other entries, you can use pyiceberg --catalog foo list
> However, what if default is not set in the catalog?

This should error because the CLI cannot connect to any catalog.
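
A minimal sketch of the lookup described above, assuming the config file has already been parsed into a dictionary. The function name `resolve_catalog_properties` and its signature are hypothetical and not part of pyiceberg. Letting command-line properties take precedence over the file entry is also an assumption made for illustration.

```python
from typing import Dict, Optional

Properties = Dict[str, str]


def resolve_catalog_properties(
    config: Dict[str, Dict[str, Properties]],
    name: Optional[str] = None,
    overrides: Optional[Properties] = None,
) -> Properties:
    """Pick the connection properties for one catalog (hypothetical helper)."""
    overrides = dict(overrides or {})
    selected = name or "default"  # no --catalog given: use the "default" entry
    entries = config.get("catalog", {})

    if selected in entries:
        # Assumption: properties passed on the command line win over the file.
        return {**entries[selected], **overrides}
    if overrides:
        # No matching entry, but something like --uri was passed directly.
        return overrides
    raise ValueError(
        f"No catalog entry named {selected!r} in the config "
        "and no connection properties were given"
    )


if __name__ == "__main__":
    config = {"catalog": {"hive": {"uri": "thrift://localhost:9083"}}}
    print(resolve_catalog_properties(config, "hive"))
    try:
        resolve_catalog_properties(config)  # no "default" entry
    except ValueError as exc:
        print(f"error: {exc}")
```

With only a `hive` entry in the config, asking for no catalog by name raises the error, which matches the expectation that the CLI cannot connect to anything in that case.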
> if the command is pyiceberg list --catalog hive

It would be nice to not enforce the order of the parameters. I think pyiceberg list --catalog hive should work the same as pyiceberg --catalog hive list.
> Also, I would like to know whether you can provide an example of .pyiceberg.yaml that I can test locally. I am a bit confused about the content of .pyiceberg.yaml. For example, can we set the same uri prefix for both hive and default?

Your example looks correct. You can set the same uri if you like. The hive and default are just names you give to the specific configs. You can call them whatever you want as long as you refer to them by the same name in the CLI command.
This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs. To permanently prevent this issue from being considered stale, add the label 'not-stale', but commenting on the issue is preferred when possible.