Error: `table_type` missing from table parameters when loading table from Hive metastore
Apache Iceberg version
main (development)
Please describe the bug 🐞
I am (or rather, a user of tap-iceberg is) running into the following error when trying to load a Hive table using pyiceberg.
pyiceberg.exceptions.NoSuchPropertyException: Property table_type missing, could not determine type: bronze.my_iceberg_table
The call in question is https://github.com/shaped-ai/tap-iceberg/blob/38064b3aaca5394ba1482970e790d3e2f6020946/tap_iceberg/tap.py#L94.
It seems the loaded table is missing the `table_type` parameter that is checked in
https://github.com/apache/iceberg-python/blob/d587e6724685744918ecf192724437182ad01abf/pyiceberg/catalog/hive.py#L329-L331
Is that expected?
Thanks in advance if this turns out to be user error 😃
load_table is a two-step process. First it fetches the table from HMS using get_table, then it converts the Hive table into an Iceberg table (_convert_hive_into_iceberg).
https://github.com/apache/iceberg-python/blob/d587e6724685744918ecf192724437182ad01abf/pyiceberg/catalog/hive.py#L524-L527
The error comes from the second step, which expects the Hive table to have a "table_type" property whose value is the string "iceberg".
https://github.com/apache/iceberg-python/blob/d587e6724685744918ecf192724437182ad01abf/pyiceberg/catalog/hive.py#L274-L285
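To make the second step concrete, here is a minimal, self-contained sketch of the kind of check described above. It is not the PyIceberg source: `check_table_type` and both exception classes are hypothetical stand-ins, and the case-insensitive comparison is an assumption.

```python
from typing import Dict

TABLE_TYPE = "table_type"
ICEBERG = "iceberg"


class NoSuchPropertyError(Exception):
    """Stand-in for pyiceberg.exceptions.NoSuchPropertyException."""


class NotAnIcebergTableError(Exception):
    """Hypothetical error for a table whose table_type is not Iceberg."""


def check_table_type(parameters: Dict[str, str], identifier: str) -> None:
    """Raise if the HMS table parameters do not mark the table as Iceberg."""
    # A table registered as a plain Hive table has no table_type parameter at all.
    if TABLE_TYPE not in parameters:
        raise NoSuchPropertyError(
            f"Property {TABLE_TYPE} missing, could not determine type: {identifier}"
        )
    # Assumption: the value is compared case-insensitively against "iceberg".
    if parameters[TABLE_TYPE].lower() != ICEBERG:
        raise NotAnIcebergTableError(
            f"Property {TABLE_TYPE} is {parameters[TABLE_TYPE]}, expected {ICEBERG}: {identifier}"
        )


# A table whose parameters carry the marker passes silently.
check_table_type({"table_type": "ICEBERG"}, "bronze.my_iceberg_table")
```

Calling it with an empty parameter dict reproduces the shape of the error reported above.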
Who created the table in this case? When PyIceberg creates the table, it injects the table_type property
https://github.com/apache/iceberg-python/blob/d587e6724685744918ecf192724437182ad01abf/pyiceberg/catalog/hive.py#L373
> Who created the table in this case? When PyIceberg creates the table, it injects the table_type property
I suppose it was created by a third party and not by HiveCatalog.create_table. Are only tables created by pyiceberg supported here?
> Are only tables created by pyiceberg supported here?
Anyone can create an iceberg table using HMS, which can be read by PyIceberg. In HMS, the assumption is that iceberg tables have a specific property set so that engines can distinguish between hive and iceberg tables.
In this case, the table was created as a "hive table" and not an "iceberg table".
@kevinjqliu thanks for the info 🙏. Just two more questions:
- is there a way to set this property manually?
- would doing that break something?
> is there a way to set this property manually?
You can use an engine (like Spark/Trino) to interact with the Hive table and add the extra table parameter. Alternatively, a hacky way is to use the Hive client in pyiceberg, like so: https://github.com/apache/iceberg-python/blob/d587e6724685744918ecf192724437182ad01abf/pyiceberg/catalog/hive.py#L570-L574. It should work, but definitely test it out first.
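A rough sketch of the hacky route, for illustration only. It assumes `catalog` is a pyiceberg HiveCatalog whose private `_client` attribute works as a context manager exposing the metastore's get_table and alter_table calls, as in the linked lines. The helper name `set_iceberg_table_type` is made up here, and none of this has been run against a real metastore.

```python
from typing import Any


def set_iceberg_table_type(catalog: Any, database: str, table: str) -> None:
    """Add table_type=iceberg to an existing HMS table's parameters."""
    # Assumption: `_client` is a private attribute and may change between releases.
    with catalog._client as open_client:
        hive_table = open_client.get_table(dbname=database, tbl_name=table)
        # Copy the existing parameters so nothing already set is dropped.
        parameters = dict(hive_table.parameters or {})
        parameters["table_type"] = "iceberg"
        hive_table.parameters = parameters
        # Write the modified table definition back to the metastore.
        open_client.alter_table(dbname=database, tbl_name=table, new_tbl=hive_table)
```

Usage would look like `set_iceberg_table_type(catalog, "bronze", "my_iceberg_table")`. As far as I can tell, PyIceberg also looks for a metadata_location parameter when loading, so setting table_type alone may not be enough for a table that was never written as Iceberg.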
> would doing that break something?
Nope, adding that specific parameter to HMS is how an Iceberg table is defined. You can see examples in the core Iceberg library and in an engine like Trino.