gravitino
[Bug report] hive catalog include iceberg table?
Version
main branch
Describe what's wrong
A schema in the Hive catalog lists Iceberg tables,
but the Iceberg catalog doesn't list Hive tables.
Error message and/or stacktrace
empty
How to reproduce
Use beeline to create a Hive table and an Iceberg table in the same database.
Additional context
No response
@mchades Can you please take a look? From a cursory glance, I feel that the Hive catalog should filter out non-Hive tables when fetching from HMS. WDYT?
@mygrsun would you like to give it a try and fix it?
Does the table in HMS not belong to Hive? How to distinguish whether a table in HMS belongs to Hive or Iceberg? If it is distinguished by the values of InputFormat and OutputFormat properties, then what kind of table should an Iceberg table created through Hive belong to?
There is a reserved property (or something similar) that distinguishes whether a table is a Hive table or an Iceberg table. Hudi and other formats should also have a flag to differentiate them.
If I run show tables directly in Hive, can I also see the Iceberg table?
I guess so; you can give it a try. You can probably list Iceberg tables from Hive, but not Hive tables from the Iceberg catalog.
The Iceberg catalog uses a specific table parameter, table_type, to check whether a table is an Iceberg table:
// Fetch only the table names of the database from HMS.
List<String> tableNames = clients.run(client -> client.getAllTables(database));
List<TableIdentifier> tableIdentifiers;
if (listAllTables) {
  // Return every table name without checking the table type.
  tableIdentifiers =
      tableNames.stream()
          .map(t -> TableIdentifier.of(namespace, t))
          .collect(Collectors.toList());
} else {
  // Load the table objects and keep only those whose table_type parameter marks them as Iceberg.
  List<Table> tableObjects =
      clients.run(client -> client.getTableObjectsByName(database, tableNames));
  tableIdentifiers =
      tableObjects.stream()
          .filter(
              table ->
                  table.getParameters() != null
                      && BaseMetastoreTableOperations.ICEBERG_TABLE_TYPE_VALUE
                          .equalsIgnoreCase(
                              table
                                  .getParameters()
                                  .get(BaseMetastoreTableOperations.TABLE_TYPE_PROP)))
          .map(table -> TableIdentifier.of(namespace, table.getTableName()))
          .collect(Collectors.toList());
}
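For the Hive catalog side, the same check could simply be inverted. The following is only a minimal sketch under stated assumptions, not Gravitino's actual implementation: the class IcebergTableFilter and the method hiveOnlyTables are hypothetical names, and plain maps stand in for the parameters that HMS returns from Table.getParameters(). The two constants mirror the values of Iceberg's TABLE_TYPE_PROP and ICEBERG_TABLE_TYPE_VALUE.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper, for illustration only.
class IcebergTableFilter {
  // Assumed to match BaseMetastoreTableOperations.TABLE_TYPE_PROP / ICEBERG_TABLE_TYPE_VALUE.
  static final String TABLE_TYPE_PROP = "table_type";
  static final String ICEBERG_TABLE_TYPE_VALUE = "iceberg";

  // True when the table parameters mark the table as Iceberg (case-insensitive).
  static boolean isIcebergTable(Map<String, String> parameters) {
    return parameters != null
        && ICEBERG_TABLE_TYPE_VALUE.equalsIgnoreCase(parameters.get(TABLE_TYPE_PROP));
  }

  // Keeps the names of all tables that are NOT marked as Iceberg.
  // Input: table name -> table parameters (stand-in for HMS table objects).
  static List<String> hiveOnlyTables(Map<String, Map<String, String>> tables) {
    return tables.entrySet().stream()
        .filter(e -> !isIcebergTable(e.getValue()))
        .map(Map.Entry::getKey)
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    Map<String, Map<String, String>> tables = new LinkedHashMap<>();
    tables.put("orders", Map.of("transient_lastDdlTime", "1700000000"));
    tables.put("events", Map.of("table_type", "ICEBERG"));
    System.out.println(hiveOnlyTables(tables)); // prints [orders]
  }
}
```

A table with no parameters at all is treated as a Hive table here, which matches the null check in the Iceberg snippet above.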
@mygrsun How do you distinguish between Hive tables and Iceberg tables, and what behavior do you expect?
We want to get separate lists of Iceberg tables and Hive tables. I think the approach FANNG1 suggested is OK.
@mygrsun do you want to fix this?
Yes, I plan to fix it.
Great! Can your fix make it into the 0.5.1 release? We plan to release it this week.
Please check my design.
The goal is to be able to either list all tables or list only Hive tables, excluding Iceberg tables.
My design adds a property to the catalog properties
that controls whether listing returns all tables or only Hive tables without Iceberg tables.
The property name is list-table-with-iceberg:
public static final String LIST_TABLE_WITH_ICEBERG = "list-table-with-iceberg";
Do you think this is OK? @FANNG1 @mchades
I saw that the Iceberg community has also encountered similar issues before. It is worth noting that when there are too many tables, filtering tables may cause performance issues.
So I think we should add a list-all-tables property with a default value of true in the Hive catalog. This is consistent with the behavior of the Hive client, and users can set it to false when they need to filter. WDYT? @mygrsun @FANNG1 @jerryshao
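To make the proposal concrete, here is a rough sketch of how such a property could be read and applied. It is an assumption-laden illustration, not the merged fix: the class ListAllTablesOption and its methods are hypothetical, and in-memory maps stand in for the HMS calls (getAllTables for the names-only path, getTableObjectsByName for the filtering path).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical sketch of the proposed catalog property, for illustration only.
class ListAllTablesOption {
  static final String LIST_ALL_TABLES = "list-all-tables";
  // Proposed default: behave like the Hive client and list everything.
  static final boolean DEFAULT_LIST_ALL_TABLES = true;

  // Reads the property from the catalog properties, falling back to the default.
  static boolean listAllTables(Map<String, String> catalogProperties) {
    String value = catalogProperties.get(LIST_ALL_TABLES);
    return value == null ? DEFAULT_LIST_ALL_TABLES : Boolean.parseBoolean(value.trim());
  }

  // Assumed marker: the table_type parameter set to "iceberg" (case-insensitive).
  private static boolean isIceberg(Map<String, String> parameters) {
    return parameters != null && "iceberg".equalsIgnoreCase(parameters.get("table_type"));
  }

  // tables: table name -> table parameters (stand-in for what HMS would return).
  static List<String> listTables(
      Map<String, String> catalogProperties, Map<String, Map<String, String>> tables) {
    if (listAllTables(catalogProperties)) {
      // Cheap path: only names are needed, no per-table inspection.
      return new ArrayList<>(tables.keySet());
    }
    // Costlier path: every table's parameters must be inspected.
    return tables.entrySet().stream()
        .filter(e -> !isIceberg(e.getValue()))
        .map(Map.Entry::getKey)
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    Map<String, Map<String, String>> tables = new LinkedHashMap<>();
    tables.put("orders", Map.of("owner", "etl"));
    tables.put("events", Map.of("table_type", "ICEBERG"));
    System.out.println(listTables(Map.of(), tables)); // prints [orders, events]
    System.out.println(listTables(Map.of(LIST_ALL_TABLES, "false"), tables)); // prints [orders]
  }
}
```

Keeping true as the default means the extra per-table lookup is only paid by users who explicitly opt in to filtering.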
I think that is okay.
Yes, I can have the fix ready for the 0.5.1 release.