Error to route when requestAnalyzerConfig is True

Open ndrluis opened this issue 1 year ago • 4 comments

Hello,

I activated the io.trino.gateway.ha.module.QueryCountBasedRouterProvider and have two routing groups, each with one cluster.

I also have a rule that directs queries from the "dbt-trino" source to the large routing group. However, when I execute multiple queries, some of them are being routed to the adhoc routing group. Is this expected behavior, considering that the QueryCount router might have higher priority than the routing rule, or is this a bug?

Routing Rule

---
name: "dbt"
description: "if query from dbt"
condition: "request.getHeader(\"X-Trino-Source\").startsWith(\"dbt-trino\")"
actions:
  - "result.put(\"routingGroup\", \"large\")"

[Image: Clusters]

[Image: Routing History]

ndrluis avatar Sep 26 '24 20:09 ndrluis

I removed the QueryCountBasedRouterProvider, but the latest query from dbt is being routed to the adhoc group. The difference is that the latest query is a CREATE TABLE statement. I conducted 5 tests, and in all of them, the latest query was routed to adhoc.

I feel that it might just be a coincidence, because when I try to run just a CREATE TABLE, the query is routed to the large routing group.

ndrluis avatar Sep 26 '24 21:09 ndrluis

Sometimes this error appears in the log:

2024-09-26T22:50:44.831Z	ERROR	http-worker-62	io.trino.gateway.ha.router.RuleReloadingRoutingGroupSelector	Error opening rules configuration file, using routing group header as default.
com.google.common.base.VerifyException: Identifier cannot be empty or null
	at com.google.common.base.Verify.verify(Verify.java:126)
	at io.trino.sql.tree.Identifier.isValidIdentifier(Identifier.java:133)
	at io.trino.sql.tree.Identifier.<init>(Identifier.java:56)
	at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:212)
	at java.base/java.util.AbstractList$RandomAccessSpliterator.forEachRemaining(AbstractList.java:722)
	at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:556)
	at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:546)
	at java.base/java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:921)
	at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:265)
	at java.base/java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:702)
	at io.trino.sql.tree.QualifiedName.of(QualifiedName.java:42)
	at io.trino.gateway.ha.router.TrinoQueryProperties.qualifyName(TrinoQueryProperties.java:386)
	at io.trino.gateway.ha.router.TrinoQueryProperties.getNames(TrinoQueryProperties.java:335)
	at io.trino.gateway.ha.router.TrinoQueryProperties.getNames(TrinoQueryProperties.java:341)
	at io.trino.gateway.ha.router.TrinoQueryProperties.getNames(TrinoQueryProperties.java:341)
	at io.trino.gateway.ha.router.TrinoQueryProperties.getNames(TrinoQueryProperties.java:341)
	at io.trino.gateway.ha.router.TrinoQueryProperties.getNames(TrinoQueryProperties.java:341)
	at io.trino.gateway.ha.router.TrinoQueryProperties.getNames(TrinoQueryProperties.java:341)
	at io.trino.gateway.ha.router.TrinoQueryProperties.getNames(TrinoQueryProperties.java:341)
	at io.trino.gateway.ha.router.TrinoQueryProperties.processRequestBody(TrinoQueryProperties.java:197)
	at io.trino.gateway.ha.router.TrinoQueryProperties.<init>(TrinoQueryProperties.java:140)
	at io.trino.gateway.ha.router.RuleReloadingRoutingGroupSelector.findRoutingGroup(RuleReloadingRoutingGroupSelector.java:99)
	at io.trino.gateway.ha.handler.RoutingTargetHandler.getBackendFromRoutingGroup(RoutingTargetHandler.java:86)
	at io.trino.gateway.ha.handler.RoutingTargetHandler.lambda$getRoutingDestination$0(RoutingTargetHandler.java:66)
	at java.base/java.util.Optional.orElseGet(Optional.java:364)
	at io.trino.gateway.ha.handler.RoutingTargetHandler.getRoutingDestination(RoutingTargetHandler.java:66)
	at io.trino.gateway.proxyserver.RouteToBackendResource.postHandler(RouteToBackendResource.java:67)
	at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)
	at java.base/java.lang.reflect.Method.invoke(Method.java:580)
	at org.glassfish.jersey.server.model.internal.ResourceMethodInvocationHandlerFactory.lambda$static$0(ResourceMethodInvocationHandlerFactory.java:52)
	at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher$1.run(AbstractJavaResourceMethodDispatcher.java:146)
	at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.invoke(AbstractJavaResourceMethodDispatcher.java:189)
	at org.glassfish.jersey.server.model.internal.JavaResourceMethodDispatcherProvider$VoidOutInvoker.doDispatch(JavaResourceMethodDispatcherProvider.java:159)
	at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.dispatch(AbstractJavaResourceMethodDispatcher.java:93)
	at org.glassfish.jersey.server.model.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:478)
	at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:400)
	at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:81)
	at org.glassfish.jersey.server.ServerRuntime$1.run(ServerRuntime.java:274)
	at org.glassfish.jersey.internal.Errors$1.call(Errors.java:248)
	at org.glassfish.jersey.internal.Errors$1.call(Errors.java:244)
	at org.glassfish.jersey.internal.Errors.process(Errors.java:292)
	at org.glassfish.jersey.internal.Errors.process(Errors.java:274)
	at org.glassfish.jersey.internal.Errors.process(Errors.java:244)
	at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:266)
	at org.glassfish.jersey.server.ServerRuntime.process(ServerRuntime.java:253)
	at org.glassfish.jersey.server.ApplicationHandler.handle(ApplicationHandler.java:696)
	at org.glassfish.jersey.servlet.WebComponent.serviceImpl(WebComponent.java:397)
	at org.glassfish.jersey.servlet.WebComponent.service(WebComponent.java:349)
	at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:358)
	at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:312)
	at org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:205)
	at org.eclipse.jetty.ee10.servlet.ServletHolder.handle(ServletHolder.java:736)
	at org.eclipse.jetty.ee10.servlet.ServletHandler$ChainEnd.doFilter(ServletHandler.java:1614)
	at org.eclipse.jetty.ee10.servlet.ServletHandler$MappedServlet.handle(ServletHandler.java:1547)
	at org.eclipse.jetty.ee10.servlet.ServletChannel.dispatch(ServletChannel.java:824)
	at org.eclipse.jetty.ee10.servlet.ServletChannel.handle(ServletChannel.java:436)
	at org.eclipse.jetty.ee10.servlet.ServletHandler.handle(ServletHandler.java:464)
	at org.eclipse.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:597)
	at org.eclipse.jetty.server.handler.ContextHandler.handle(ContextHandler.java:1060)
	at org.eclipse.jetty.server.Handler$Wrapper.handle(Handler.java:740)
	at org.eclipse.jetty.server.handler.EventsHandler.handle(EventsHandler.java:81)
	at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:151)
	at org.eclipse.jetty.server.Handler$Wrapper.handle(Handler.java:740)
	at org.eclipse.jetty.server.handler.EventsHandler.handle(EventsHandler.java:81)
	at org.eclipse.jetty.server.Server.handle(Server.java:181)
	at org.eclipse.jetty.server.internal.HttpChannelState$HandlerInvoker.run(HttpChannelState.java:648)
	at org.eclipse.jetty.server.internal.HttpConnection.onFillable(HttpConnection.java:403)
	at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:322)
	at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:99)
	at org.eclipse.jetty.io.SelectableChannelEndPoint$1.run(SelectableChannelEndPoint.java:53)
	at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.runTask(AdaptiveExecutionStrategy.java:478)
	at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.consumeTask(AdaptiveExecutionStrategy.java:441)
	at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.tryProduce(AdaptiveExecutionStrategy.java:293)
	at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.run(AdaptiveExecutionStrategy.java:201)
	at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:311)
	at org.eclipse.jetty.util.thread.MonitoredQueuedThreadPool$1.run(MonitoredQueuedThreadPool.java:73)
	at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:979)
	at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.doRunJob(QueuedThreadPool.java:1209)
	at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1164)
	at java.base/java.lang.Thread.run(Thread.java:1570)

ndrluis avatar Sep 26 '24 22:09 ndrluis

I discovered that this error only happens when the requestAnalyzerConfig is set to True. Here is an example of the query:

CREATE OR REPLACE TABLE "catalog"."database"."example_table"

WITH (
      "extra_properties" = MAP(
        ARRAY['optimizer.enabled', 'compaction.enabled'],
        ARRAY['false', 'false']
      ))
AS (

WITH cte_example_sheets AS (
  SELECT *
  FROM "catalog"."database"."example_source"
)

, cte_example AS (
  SELECT
    CAST(field1 AS VARCHAR) AS alias1
    , CAST(field2 AS VARCHAR) AS alias2
    , CAST(field3 AS VARCHAR) AS alias3
    , CAST(SPLIT_PART(field4, '-', 1) AS VARCHAR) AS alias4
    , CAST(SPLIT_PART(field4, '-', 2) AS VARCHAR) AS alias5
    , CAST(field5 AS VARCHAR) AS alias6
    , CAST(DATE_PARSE(field6, '%Y-%m-%d %H:%i') AS TIMESTAMP(6)) AS alias7
    , CAST(DATE_PARSE(field7, '%Y-%m-%d %H:%i') AS TIMESTAMP(6)) AS alias8
    , CAST(DATE_PARSE(field8, '%Y-%m-%d %H:%i') AS TIMESTAMP(6)) AS alias9
    , CAST(DATE_PARSE(field9, '%Y-%m-%d %H:%i') AS TIMESTAMP(6)) AS alias10
    , CAST(field10 AS VARCHAR) AS alias11
    , CAST(field11 AS VARCHAR) AS alias12
    , CAST(field12 AS VARCHAR) AS alias13
  FROM cte_example_sheets
)

SELECT * FROM cte_example
)

The problem is that the exact same query runs without error when I use the Trino CLI as the client, but it fails every time I use the dbt client.

ndrluis avatar Sep 30 '24 17:09 ndrluis

This is still a problem in trino-gateway 16 for certain types of queries if the default schema is (incorrectly) set to an empty string.

Invalid queries that reference an undefined identifier in certain positions, for example

...
, some_cte AS (
    SELECT some_column FROM some_table st
    LEFT JOIN undefined_identifier ui ON ui.some_column = st.some_column
),
...

and queries with the following pattern

SELECT * FROM TABLE(exclude_columns(...)) ...

fail with an unhandled

com.google.common.base.VerifyException: Identifier cannot be empty or null

before they reach the Trino cluster.
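
One possible reading of the stack trace earlier in this thread, offered as a sketch rather than the gateway's actual code: `TrinoQueryProperties.qualifyName` appears to expand short table names with default catalog and schema values, and the `Identifier` constructor rejects an empty part. The class `QualifyNameSketch` and its methods below are hypothetical stand-ins written for illustration; only the error message is taken from the log.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical stand-in, NOT the gateway's TrinoQueryProperties implementation.
final class QualifyNameSketch {
    // Mimics the precondition whose message appears in the reported stack trace.
    static String identifier(String value) {
        if (value == null || value.isEmpty()) {
            throw new IllegalStateException("Identifier cannot be empty or null");
        }
        return value;
    }

    // Expands a one- or two-part name to catalog.schema.table using the defaults.
    // Assumption: defaults are only consulted when a part is missing.
    static List<String> qualify(List<String> parts, String defaultCatalog, String defaultSchema) {
        List<String> full = new ArrayList<>();
        if (parts.size() < 3) {
            full.add(defaultCatalog);
        }
        if (parts.size() < 2) {
            full.add(defaultSchema);
        }
        full.addAll(parts);

        List<String> checked = new ArrayList<>();
        for (String part : full) {
            checked.add(identifier(part)); // throws on an empty default
        }
        return checked;
    }

    // Defensive variant: report "cannot qualify" instead of letting the exception escape.
    static Optional<List<String>> tryQualify(List<String> parts, String defaultCatalog, String defaultSchema) {
        try {
            return Optional.of(qualify(parts, defaultCatalog, defaultSchema));
        } catch (IllegalStateException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // A fully qualified name never touches the defaults, so empty defaults are harmless.
        System.out.println(qualify(List.of("catalog", "database", "example_source"), "", ""));
        // A one-part name (for example a CTE or undefined identifier) needs the empty schema.
        System.out.println(tryQualify(List.of("undefined_identifier"), "catalog", ""));
    }
}
```

Under these assumptions, fully qualified table names pass while a one-part name combined with an empty default schema trips the check, which would be consistent with the patterns listed above. Whether the gateway should skip such names or fall back to header-based routing is a design decision for the maintainers.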

vaultah avatar Nov 07 '25 13:11 vaultah