Inconsistency in catalog.list_tables Behavior Across Python and Java: Returns Non-Iceberg Tables in Python Only
Feature Request / Improvement
I noticed that in Python, the Hive, Glue, and DynamoDB catalogs list all tables in the namespace, including non-Iceberg ones:
https://github.com/apache/iceberg-python/blob/acc934fb76aa6c6e2e32b60c8a99f9e2b2c627dd/pyiceberg/catalog/hive.py#L488-L504
https://github.com/apache/iceberg-python/blob/acc934fb76aa6c6e2e32b60c8a99f9e2b2c627dd/pyiceberg/catalog/glue.py#L584-L613
However, in Java, we apply a filter so that only Iceberg tables in the given namespace are returned: see GlueCatalog.listTables and HiveCatalog.listTables.
I forget whether we discussed this before: why did we choose to include non-Iceberg tables in the result in Python?
cc @Fokko
> Why do we choose to include non-iceberg tables in the result in python?
I don't think we should. Using HMS for both Hive and Iceberg tables is pretty common, so we should filter the result to return only Iceberg tables.
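For illustration, here is a minimal sketch of what such a filter could look like. It assumes that Iceberg tables are marked in the metastore with a `table_type` parameter whose value is `iceberg` (compared case-insensitively), which is the convention the Java catalogs check. `MetastoreTable`, `is_iceberg_table`, and `list_iceberg_tables` are hypothetical names used only for this example, not PyIceberg APIs, and the real change would operate on the Hive or Glue response objects instead.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Optional, Tuple

# Assumed convention: Iceberg tables carry table_type=ICEBERG in their parameters.
TABLE_TYPE = "table_type"
ICEBERG = "iceberg"


@dataclass
class MetastoreTable:
    """Hypothetical stand-in for a table entry returned by Hive or Glue."""

    database: str
    name: str
    parameters: Optional[Dict[str, str]] = field(default_factory=dict)


def is_iceberg_table(table: MetastoreTable) -> bool:
    """Return True if the table's parameters mark it as an Iceberg table."""
    parameters = table.parameters or {}  # parameters may be missing entirely
    return parameters.get(TABLE_TYPE, "").lower() == ICEBERG


def list_iceberg_tables(tables: Iterable[MetastoreTable]) -> List[Tuple[str, str]]:
    """Keep only Iceberg tables and return them as (database, name) identifiers."""
    return [(t.database, t.name) for t in tables if is_iceberg_table(t)]


tables = [
    MetastoreTable("db", "events", {"table_type": "ICEBERG"}),
    MetastoreTable("db", "legacy_hive", {"EXTERNAL": "TRUE"}),
    MetastoreTable("db", "no_params", None),
]
iceberg_only = list_iceberg_tables(tables)
```

With the sample entries above, only `("db", "events")` survives the filter; the plain Hive table and the entry with no parameters are dropped.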
I'd like to work on this, if possible.
@mark-major sure thing, assigned to you