presto "hive"."information_schema"."columns" with 'like' hangs

We, at LendingClub, using Presto "0.157.1.t.0.5-1.x86_64" and we access Presto with JDBC in SQLWorkBench. while we are trying to list Columns in any database in "Database Explorer" , most of the time it hangs.

We saw in Discovery URI the query its trying is

SELECT * FROM "hive"."information_schema"."columns" WHERE table_schema like '<schema_name>' AND table_name LIKE '<table_name>' AND column_name LIKE '%'

In this scenario, most of the time this hangs .

We also reproduced the issue by smaller query

SELECT * FROM "hive"."information_schema"."columns" WHERE table_schema like '<schema_name>' hangs

But

SELECT * FROM "hive"."information_schema"."columns" WHERE table_schema = '<schema_name>'

send result immediately

Jan 18 '17 08:01 rajitsaha

How many tables you have in your Hive setup? Do you have any non-ascii chars in table names maybe? Can you provide a Hive setup to reproduce the problem? E.g. what tables should I create in Hive to reproduce it?

My example setup works fine (no delay):

presto> SELECT * FROM "hive"."information_schema"."columns" WHERE table_schema like 'default';
 table_catalog | table_schema | table_name | column_name | ordinal_position | column_default | is_nullable | data_type | comment
---------------+--------------+------------+-------------+------------------+----------------+-------------+-----------+---------
 hive          | default      | foo        | a           |                1 | NULL           | YES         | bigint    | NULL
(1 row)

Also, do you use JDBC drivers distributed along with TD release or open source version?

Jan 18 '17 15:01 maciejgrzybek

Our HiveMetaStore is pretty huge

Number of Schemas ~ 1000
Number of Tables 200K
Number of Columns ~10M

We noticed, whenever we use
SELECT * FROM "hive"."information_schema"."columns" WHERE table_schema like '<schema_name>'

Presto send a call for each combination of cmd=get_table : db= tbl= to Hive Metastore
So there are too many calls to Hive Metastore and either it hangs or timeout or sometime we get error like
[Teradata]Presto Query failed: outputFormat should not be accessed from a null StorageFormat. [SQL State=HY000, DB Errorcode=100050]
we are using Teradata JDBC

Jan 18 '17 18:01 rajitsaha

What's your caching policy (hive.metastore-cache-ttl in catalog/hive.properties)? This could help with reducing the load on subsequent queries, however with your setup of 200k tables and 1k schemas query like ... where table_shema like ... will have to retrieve all possible schema x table combinations from metastore. The way to solve that is to have more specific filtering predicate, i.e. having some of the predicates of equality type instead of like. E.g. where table_schema like <schema_name> and table_name = <table_name> would reduce the time of query significantly because it would need to scan only 1000 entries in metastore instead of 200 million. Is the query you presented generated by SQLWorkBench itself?

Jan 18 '17 19:01 maciejgrzybek

Thanks for your reply.

We have not set hive.metastore-cache-ttl , its default. What would be the appropriate value for this Do we also need to change hive.metastore-timeout?

For second part - Actually we found that where table_schema like <schema_name> and table_name = <table_name> would be better choice , but the problem is

Presto JDBC is sending like query while exploring Columns in SQLWorkbench in "Database Explorer"

Jan 18 '17 19:01 rajitsaha

Default value for metastore-cache-ttl is 1 hour which should be sufficiently long for you. The general idea behind this property is to avoid subsequent queries to the metastore. Instead, the schema is resolved from cache. There's an obvious speed up because you don't have to query the same metadata twice within the same cache-ttl time frame. However, you may see outdated schema if it changed after last cache refresh but before cache-ttl time elapsed. E.g. with cache-ttl = 1h: At time t0 = 0 you query metadata, it takes long time to retrieve but it's stored for future use. At time t1 = 10 min you query the same metadata again, then it's fast because it's retrieved from cache At time t2 = 15 min you modify schema (e.g. add/remove column in some table) At time t3 = 20 min (still before cache-ttl time since last cache retrieval) you query metadata again and see the same results as in t0 and t1 (without the change from t2 reflected). At time t4 = 60 min query will refresh the cache (takes longer than yet before) and the change from t2 starts to be visible.

metastore-timeout should be increased only when the default (10s) time is not sufficient for restoring metadata from Hive Metastore. This could be your case. However, you want to lower the time spend on retrieving metadata, not only suppress errors caused by its longer than usual operation time.

We are looking into the JDBC drivers to make them perform better in your use case. However, the earliest possible patch would be in the next release (after 157t).

Jan 18 '17 20:01 maciejgrzybek

Thanks for your help. Any suggestion how we can resolve this issue for now . All our customers are having very bad experience as SQLWorkbench is hung.

Jan 18 '17 22:01 rajitsaha

Seems like queries with "like" schema name is passed as Empty or "*" , so it is processing all the tables from all the schema

Jan 19 '17 20:01 rajitsaha

Hi @rschlussel any pointer to solve this issue , Thanks

Feb 01 '17 23:02 rajitsaha

@rajitsaha We believe this is a driver issue. We are working with the vendor to resolve the issue. We will update this issue once we have more information.

Feb 02 '17 02:02 mattsfuller

Thanks

Feb 02 '17 20:02 rajitsaha

presto presto copied to clipboard

"hive"."information_schema"."columns" with 'like' hangs

presto
presto copied to clipboard