trino-gateway icon indicating copy to clipboard operation
trino-gateway copied to clipboard

[Question] Route queries and spooled segment downloads to the same cluster

Open JustinObanor opened this issue 1 year ago • 0 comments

My setup is 1 gateway and 2 trino clusters.

Both clusters have their own cloud storage where spooled segments are stored

I noticed the behaviour that when the gateway makes a request to a cluster to handle a query, the gateway continues to route to that cluster based on the query ID: https://trinodb.github.io/trino-gateway/routing-logic/#routing-based-on-query-identifier-default

The issue I face is that when it's time for the gateway to download the spooled segments, it doesnt route to the same cluster anymore. It routes to the next cluster which results in an error because the next cluster doesn't know anything about those spooled segments

Logs from the gateway:

2025-04-22T16:04:29.346Z    INFO    http-worker-677    io.trino.gateway.ha.handler.RoutingTargetHandler    Rerouting [https://10.60.5.187:8443/v1/statement/executing/20250422_160428_00011_n9kdq/y5ad3e6ec86989f7de2fb769533e74657c15 d798c/2]--> [https://trino.trinoprdt-p.svc.cluster.local:443/v1/statement/executing/20250422_160428_00011_n9kdq/y5a d3e6ec86989f7de2fb769533e74657c15d798c/2]                                                                           
2025-04-22T16:04:29.468Z    INFO    http-worker-677    io.trino.gateway.ha.handler.RoutingTargetHandler    Rerouting [https://10.60.5.187:8443/v1/spooled/download/BJ6Ea1ylqzZjibfRLN4BH_RjfWEm2wHbOEc_SkT-yMc=]--> [https://trino.trinoprdt-p.svc.cluster.local:443/v1/spooled/download/BJ6Ea1ylqzZjibfRLN4BH_RjfWEm2wHbOEc_SkT-yMc=]                     
2025-04-22T16:04:31.353Z    INFO    http-worker-455    io.trino.gateway.ha.handler.RoutingTargetHandler    Rerouting [https://10.60.5.187:8443/v1/spooled/ack/BJ6Ea1ylqzZjibfRLN4BH_RjfWEm2wHbOEc_SkT-yMc=]--> [https://trino.trinoprdt-s.svc.cluster.local:443/v1/spooled/ack/BJ6Ea1ylqzZjibfRLN4BH_RjfWEm2wHbOEc_SkT-yMc=]                              
2025-04-22T16:04:31.373Z    INFO    http-worker-522    io.trino.gateway.ha.handler.RoutingTargetHandler    Rerouting [https://10.60.5.187:8443/v1/spooled/download/yP_MO1fKhsw7oIYUbzkJU_RjfWEm2wHbOEc_SkT-yMc=]--> [https://trino.trinoprdt-s.svc.cluster.local:443/v1/spooled/download/yP_MO1fKhsw7oIYUbzkJU_RjfWEm2wHbOEc_SkT-yMc=]                    
2025-04-22T16:05:13.619Z    INFO    pool-4-thread-1    io.trino.gateway.ha.clustermonitor.ActiveClusterMonitor    Getting stats for all active clusters                      

Logs show I made a query to the first cluster trino.trinoprdt-p, but the gateway attempts to download a segment from the second cluster trino.trinoprdt-s

My backends:

[
  {
    "active": true,
    "routingGroup": "adhoc",
    "externalUrl": "https://justin-trinoprdt-s.abc.xyz.cloud/ui",
    "name": "https://trino.trinoprdt-s.svc.cluster.local:443",
    "proxyTo": "https://trino.trinoprdt-s.svc.cluster.local:443"
  },
  {
    "active": true,
    "routingGroup": "adhoc",
    "externalUrl": "https://justin-trinoprdt-p.abc.xyz.cloud/ui",
    "name": "https://trino.trinoprdt-p.svc.cluster.local:443",
    "proxyTo": "https://trino.trinoprdt-p.svc.cluster.local:443"
  }
]

This gives the error when downloading a segment from the next cluster

DEBUG:urllib3.connectionpool:https://gateway.trinoprdt:8443 "GET /v1/spooled/ack/BJ6Ea1ylqzZjibfRLN4BH_RjfWEm2wHbOEc_SkT-yMc= HTTP/1.1" 200 0
DEBUG:urllib3.connectionpool:https://gateway.trinoprdt:8443 "GET /v1/spooled/download/yP_MO1fKhsw7oIYUbzkJU_RjfWEm2wHbOEc_SkT-yMc= HTTP/1.1" 500 5098
Traceback (most recent call last):
  File "/opt/conda/envs/python3/lib/python3.11/site-packages/trino/client.py", line 1222, in __next__
    return next(self._rows)
           ^^^^^^^^^^^^^^^^
StopIteration

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/dh216100/play2.py", line 33, in <module>
    cur.execute("SELECT * FROM hive.tpcds_europe_west1_1000.store_sales LIMIT 100000")
  File "/opt/conda/envs/python3/lib/python3.11/site-packages/trino/dbapi.py", line 614, in execute
    self._iterator = iter(self._query.execute())
                          ^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/envs/python3/lib/python3.11/site-packages/trino/client.py", line 900, in execute
    self._result.rows += self.fetch()
                         ^^^^^^^^^^^^
  File "/opt/conda/envs/python3/lib/python3.11/site-packages/trino/client.py", line 935, in fetch
    return list(SegmentIterator(spooled, self._row_mapper))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/envs/python3/lib/python3.11/site-packages/trino/client.py", line 1226, in __next__
    self._load_next_segment()
  File "/opt/conda/envs/python3/lib/python3.11/site-packages/trino/client.py", line 1239, in _load_next_segment
    self._rows = iter(self._decoder.decode(self._current_segment.segment))
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/envs/python3/lib/python3.11/site-packages/trino/client.py", line 1254, in decode
    return self._decoder.decode(spooled_data.data, spooled_data.metadata)
                                ^^^^^^^^^^^^^^^^^
  File "/opt/conda/envs/python3/lib/python3.11/site-packages/trino/client.py", line 1136, in data
    self._request.raise_response_error(http_response)
  File "/opt/conda/envs/python3/lib/python3.11/site-packages/trino/client.py", line 675, in raise_response_error
    raise exceptions.HttpError(
trino.exceptions.HttpError: error 500: b'java.io.IOException: Segment not found or expired\n\tat io.trino.spooling.filesystem.FileSystemSpoolingManager.checkFileExists(FileSystemSpoolingManager.java:243)\n\tat io.trino.spooling.filesystem.FileSystemSpoolingManager.openInputStream(FileSystemSpoolingManager.java:134)\n\tat io.trino.server.protocol.spooling.TracingSpoolingManager.lambda$openInputStream$2(TracingSpoolingManager.java:82)\n\tat io.trino.server.protocol.spooling.TracingSpoolingManager.withTracing(TracingSpoolingManager.java:135)\n\tat io.trino.server.protocol.spooling.TracingSpoolingManager.openInputStream(TracingSpoolingManager.java:82)\n\tat io.trino.server.protocol.spooling.SpoolingManagerBridge.openInputStream(SpoolingManagerBridge.java:83)\n\tat io.trino.server.protocol.spooling.CoordinatorSegmentResource.download(CoordinatorSegmentResource.java:74)\n\tat java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:103)\n\tat java.base/java.lang.reflect.Method.invoke(Method.java:580)\n\tat org.glassfish.jersey.server.model.internal.ResourceMethodInvocationHandlerFactory.lambda$static$0(ResourceMethodInvocationHandlerFactory.java:52)\n\tat org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher$1.run(AbstractJavaResourceMethodDispatcher.java:146)\n\tat org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.invoke(AbstractJavaResourceMethodDispatcher.java:189)\n\tat org.glassfish.jersey.server.model.internal.JavaResourceMethodDispatcherProvider$ResponseOutInvoker.doDispatch(JavaResourceMethodDispatcherProvider.java:176)\n\tat org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.dispatch(AbstractJavaResourceMethodDispatcher.java:93)\n\tat org.glassfish.jersey.server.model.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:478)\n\tat org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:400)\n\tat org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:81)\n\tat org.glassfish.jersey.server.ServerRuntime$1.run(ServerRuntime.java:274)\n\tat org.glassfish.jersey.internal.Errors$1.call(Errors.java:248)\n\tat org.glassfish.jersey.internal.Errors$1.call(Errors.java:244)\n\tat org.glassfish.jersey.internal.Errors.process(Errors.java:292)\n\tat org.glassfish.jersey.internal.Errors.process(Errors.java:274)\n\tat org.glassfish.jersey.internal.Errors.process(Errors.java:244)\n\tat org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:266)\n\tat org.glassfish.jersey.server.ServerRuntime.process(ServerRuntime.java:253)\n\tat org.glassfish.jersey.server.ApplicationHandler.handle(ApplicationHandler.java:696)\n\tat org.glassfish.jersey.servlet.WebComponent.serviceImpl(WebComponent.java:397)\n\tat org.glassfish.jersey.servlet.WebComponent.service(WebComponent.java:349)\n\tat org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:358)\n\tat org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:312)\n\tat org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:205)\n\tat org.eclipse.jetty.ee10.servlet.ServletHolder.handle(ServletHolder.java:736)\n\tat org.eclipse.jetty.ee10.servlet.ServletHandler$ChainEnd.doFilter(ServletHandler.java:1619)\n\tat io.airlift.http.server.tracing.TracingServletFilter.doFilter(TracingServletFilter.java:115)\n\tat org.eclipse.jetty.ee10.servlet.FilterHolder.doFilter(FilterHolder.java:205)\n\tat org.eclipse.jetty.ee10.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1591)\n\tat org.eclipse.jetty.ee10.servlet.ServletHandler$MappedServlet.handle(ServletHandler.java:1552)\n\tat org.eclipse.jetty.ee10.servlet.ServletChannel.dispatch(ServletChannel.java:819)\n\tat org.eclipse.jetty.ee10.servlet.ServletChannel.handle(ServletChannel.java:436)\n\tat org.eclipse.jetty.ee10.servlet.ServletHandler.handle(ServletHandler.java:469)\n\tat org.eclipse.jetty.server.handler.gzip.GzipHandler.handle(GzipHandler.java:611)\n\tat org.eclipse.jetty.server.handler.ContextHandler.handle(ContextHandler.java:1060)\n\tat org.eclipse.jetty.server.Handler$Wrapper.handle(Handler.java:740)\n\tat org.eclipse.jetty.server.handler.EventsHandler.handle(EventsHandler.java:81)\n\tat org.eclipse.jetty.server.Server.handle(Server.java:182)\n\tat org.eclipse.jetty.server.internal.HttpChannelState$HandlerInvoker.run(HttpChannelState.java:662)\n\tat org.eclipse.jetty.server.internal.HttpConnection.onFillable(HttpConnection.java:418)\n\tat org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:322)\n\tat org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:99)\n\tat org.eclipse.jetty.io.ssl.SslConnection$1.run(SslConnection.java:136)\n\tat org.eclipse.jetty.util.thread.MonitoredQueuedThreadPool$1.run(MonitoredQueuedThreadPool.java:73)\n\tat org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:979)\n\tat org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.doRunJob(QueuedThreadPool.java:1209)\n\tat org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1164)\n\tat java.base/java.lang.Thread.run(Thread.java:1575)\n'

How might I make the gateway to use the initial cluster it used for handling the query, to also handle the spooled segments?

JustinObanor avatar Apr 24 '25 11:04 JustinObanor