Proposal to Optimize Trino Hive Metastore Query Latency by Caching createMetastoreClient()
While analyzing Hive and Iceberg queries in Trino, we observed significant latency during the analyze phase in io.trino.plugin.hive.metastore.cache.CachingHiveMetastore#loadTable, as shown below:
@Override
public Optional<Table> getTable(String databaseName, String tableName)
{
    try {
        return retry()
                .stopOn(NoSuchObjectException.class)
                .stopOnIllegalExceptions()
                .run("getTable", stats.getGetTable().wrap(() -> {
                    try (ThriftMetastoreClient client = createMetastoreClient()) { // Takes 167 ms
                        return Optional.of(client.getTable(databaseName, tableName)); // Takes 20+ ms
                    }
                }));
    }
    // ......
}
From the code above, the majority of the time is spent in ThriftHiveMetastore:createMetastoreClient(), about 167 milliseconds, whereas the actual request to retrieve the table via ThriftMetastoreClient:getTable() takes significantly less time, around 20+ ms.
Looking at the trace logs:
`---ts=2023-12-22 17:52:34;thread_name=Query-20231222_095234_00058_j8b4z-55744;id=d9c0;is_daemon=true;priority=5;TCCL=io.trino.server.PluginClassLoader@1a2d02d8
`---[145.533198ms] io.trino.plugin.hive.metastore.thrift.ThriftHiveMetastore:createMetastoreClient()
`---[145.454194ms] io.trino.plugin.hive.metastore.thrift.IdentityAwareMetastoreClientFactory:createMetastoreClientFor() #2126
We also noticed that every transaction creates a new CachingHiveMetastore, so the cache only accelerates lookups within a single query and provides no benefit across queries in this scenario.
Given the bottleneck at createMetastoreClient(), we are considering an optimization: cache the created client so that a new ThriftMetastoreClient does not have to be created for each request. We are interested in your thoughts on this proposal.
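For illustration only, below is a minimal sketch of what such reuse could look like: a small pool that hands out already-created clients and takes them back after use, so the expensive creation path is paid only when the pool is empty. The PooledClientFactory class and its borrow()/release() methods are hypothetical and not part of Trino; only the fact that the client is Closeable (as in the try-with-resources above) is assumed, and a real implementation would also need identity separation, validation, expiry, and failure handling.

import java.io.Closeable;
import java.io.IOException;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.function.Supplier;

// Hypothetical sketch only: reuse expensive-to-create clients across requests
// instead of creating and closing one per call. Not part of Trino.
public class PooledClientFactory<C extends Closeable>
{
    private final Supplier<C> factory;                 // e.g. a reference to createMetastoreClient()
    private final ConcurrentLinkedQueue<C> idle = new ConcurrentLinkedQueue<>();
    private final int maxIdle;

    public PooledClientFactory(Supplier<C> factory, int maxIdle)
    {
        this.factory = factory;
        this.maxIdle = maxIdle;
    }

    // Borrow an idle client if one is available; otherwise pay the creation cost.
    public C borrow()
    {
        C client = idle.poll();
        return (client != null) ? client : factory.get();
    }

    // Return a client for reuse; close it only when the pool is already full.
    public void release(C client) throws IOException
    {
        if (idle.size() < maxIdle) {
            idle.offer(client);
        }
        else {
            client.close();
        }
    }
}

The call site in getTable() would then borrow a client before the request and release it afterwards instead of closing the connection in try-with-resources. Whether reuse is actually safe here (per-identity clients, delegation tokens, broken connections) is exactly the part we would like feedback on.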
cc @electrum @dain @raunaqmorarka
Are you able to tell why creation of the client takes so much time?
@electrum I haven't delved into the reasons yet.
We use Waggledance (version 3.7.0) as a router in front of the Hive Metastore (version 3.1.2), so Trino actually connects to Waggledance. The Trino cluster, Waggledance, and the Metastore are all in the same AZ, and there is no network bandwidth bottleneck.
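As a sanity check (a hypothetical diagnostic, not output from our environment), a bare TCP connect to the Waggledance endpoint can be timed separately from the full metastore client setup; if the raw connect is fast while createMetastoreClient() still takes ~150 ms, the cost is in server-side connection handling rather than the network path. The class name, host, and port below are placeholders.

import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical diagnostic, not part of Trino or Waggledance: time a bare TCP
// connect to the Waggledance endpoint to separate network latency from
// server-side connection handling. Host and port are placeholders.
public class ConnectLatencyProbe
{
    public static void main(String[] args) throws Exception
    {
        String host = "waggledance.internal"; // placeholder address
        int port = 48869;                     // placeholder Thrift port
        long start = System.nanoTime();
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), 5_000); // 5 s timeout
        }
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.printf("TCP connect to %s:%d took %d ms%n", host, port, elapsedMs);
    }
}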
The following is the time trace for Trino creating a connection to Waggledance; it shows that the delay is almost entirely on the Waggledance server side. I will continue to track Waggledance's connection handling time.