presto
presto copied to clipboard
"hive"."information_schema"."columns" with 'like' hangs
We, at LendingClub, using Presto "0.157.1.t.0.5-1.x86_64" and we access Presto with JDBC in SQLWorkBench. while we are trying to list Columns in any database in "Database Explorer" , most of the time it hangs.
We saw in Discovery URI the query its trying is
SELECT * FROM "hive"."information_schema"."columns" WHERE table_schema like '<schema_name>' AND table_name LIKE '<table_name>' AND column_name LIKE '%'
In this scenario, most of the time this hangs .
We also reproduced the issue by smaller query
SELECT * FROM "hive"."information_schema"."columns" WHERE table_schema like '<schema_name>'
hangs
But
SELECT * FROM "hive"."information_schema"."columns" WHERE table_schema = '<schema_name>'
send result immediately
How many tables you have in your Hive setup? Do you have any non-ascii chars in table names maybe? Can you provide a Hive setup to reproduce the problem? E.g. what tables should I create in Hive to reproduce it?
My example setup works fine (no delay):
presto> SELECT * FROM "hive"."information_schema"."columns" WHERE table_schema like 'default';
table_catalog | table_schema | table_name | column_name | ordinal_position | column_default | is_nullable | data_type | comment
---------------+--------------+------------+-------------+------------------+----------------+-------------+-----------+---------
hive | default | foo | a | 1 | NULL | YES | bigint | NULL
(1 row)
Also, do you use JDBC drivers distributed along with TD release or open source version?
Our HiveMetaStore is pretty huge
- Number of Schemas ~ 1000
- Number of Tables 200K
- Number of Columns ~10M
We noticed, whenever we use
SELECT * FROM "hive"."information_schema"."columns" WHERE table_schema like '<schema_name>'
Presto send a call for each combination of
cmd=get_table : db= So there are too many calls to Hive Metastore and either it hangs or timeout or sometime we get error like [Teradata]Presto Query failed: outputFormat should not be accessed from a null StorageFormat. [SQL State=HY000, DB Errorcode=100050] we are using Teradata JDBC to Hive Metastore
What's your caching policy (hive.metastore-cache-ttl
in catalog/hive.properties
)?
This could help with reducing the load on subsequent queries, however with your setup of 200k tables and 1k schemas query like ... where table_shema like ...
will have to retrieve all possible schema x table combinations from metastore. The way to solve that is to have more specific filtering predicate, i.e. having some of the predicates of equality type instead of like
.
E.g. where table_schema like <schema_name> and table_name = <table_name>
would reduce the time of query significantly because it would need to scan only 1000 entries in metastore instead of 200 million.
Is the query you presented generated by SQLWorkBench itself?
Thanks for your reply.
We have not set hive.metastore-cache-ttl
, its default. What would be the appropriate value for this
Do we also need to change hive.metastore-timeout
?
For second part - Actually we found that where table_schema like <schema_name> and table_name = <table_name>
would be better choice , but the problem is
Presto JDBC is sending like
query while exploring Columns in SQLWorkbench in "Database Explorer"
Default value for metastore-cache-ttl
is 1 hour which should be sufficiently long for you.
The general idea behind this property is to avoid subsequent queries to the metastore. Instead, the schema is resolved from cache. There's an obvious speed up because you don't have to query the same metadata twice within the same cache-ttl
time frame. However, you may see outdated schema if it changed after last cache refresh but before cache-ttl
time elapsed.
E.g. with cache-ttl = 1h:
At time t0 = 0 you query metadata, it takes long time to retrieve but it's stored for future use.
At time t1 = 10 min you query the same metadata again, then it's fast because it's retrieved from cache
At time t2 = 15 min you modify schema (e.g. add/remove column in some table)
At time t3 = 20 min (still before cache-ttl
time since last cache retrieval) you query metadata again and see the same results as in t0 and t1 (without the change from t2 reflected).
At time t4 = 60 min query will refresh the cache (takes longer than yet before) and the change from t2 starts to be visible.
metastore-timeout
should be increased only when the default (10s) time is not sufficient for restoring metadata from Hive Metastore. This could be your case. However, you want to lower the time spend on retrieving metadata, not only suppress errors caused by its longer than usual operation time.
We are looking into the JDBC drivers to make them perform better in your use case. However, the earliest possible patch would be in the next release (after 157t).
Thanks for your help. Any suggestion how we can resolve this issue for now . All our customers are having very bad experience as SQLWorkbench is hung.
Seems like queries with "like" schema name is passed as Empty or "*" , so it is processing all the tables from all the schema
Hi @rschlussel any pointer to solve this issue , Thanks
@rajitsaha We believe this is a driver issue. We are working with the vendor to resolve the issue. We will update this issue once we have more information.
Thanks