testcontainers-java icon indicating copy to clipboard operation
testcontainers-java copied to clipboard

[Bug]: trino reports ready before the engine is fully started

Open hgschmie opened this issue 2 years ago • 2 comments

Module

Trino

Testcontainers version

1.17.6

Using the latest Testcontainers version?

Yes

Host OS

MacOS

Host Arch

x86

Docker version

Docker version 20.10.21, build baeda1f

What happened?

Trino container starts up and reports as "ready". Running a query fails with "No nodes available to run query". Waiting a few seconds before running the query allows it to succeed.

-> The trino container reports "ready" before it is actually ready.

The log below shows that the query that testcontainers uses to check that the engine is ready actually fails with the same error (No nodes available to run query) but testcontainers still contains is ready to use

Relevant log output

[main] INFO org.testcontainers.utility.ImageNameSubstitutor - Image name substitution will be performed by: DefaultImageNameSubstitutor (composite of 'ConfigurationFileImageNameSubstitutor' and 'PrefixingImageNameSubstitutor')
[main] INFO org.testcontainers.dockerclient.DockerClientProviderStrategy - Loaded org.testcontainers.dockerclient.UnixSocketClientProviderStrategy from ~/.testcontainers.properties, will try it first
[main] INFO org.testcontainers.dockerclient.DockerClientProviderStrategy - Found Docker environment with local Unix socket (unix:///var/run/docker.sock)
[main] INFO org.testcontainers.DockerClientFactory - Docker host IP address is localhost
[main] INFO org.testcontainers.DockerClientFactory - Connected to docker: 
  Server Version: 20.10.21
  API Version: 1.41
  Operating System: Docker Desktop
  Total Memory: 7959 MB
[main] INFO 🐳 [testcontainers/ryuk:0.3.4] - Creating container for image: testcontainers/ryuk:0.3.4
[main] INFO 🐳 [testcontainers/ryuk:0.3.4] - Container testcontainers/ryuk:0.3.4 is starting: d6427b5811ee6d435d3ca898da5203afa7e252398b1948e33140b1d3e836acec
[main] INFO 🐳 [testcontainers/ryuk:0.3.4] - Container testcontainers/ryuk:0.3.4 started in PT0.801393S
[main] INFO org.testcontainers.utility.RyukResourceReaper - Ryuk started - will monitor and terminate Testcontainers containers on JVM exit
[main] INFO org.testcontainers.DockerClientFactory - Checking the system...
[main] INFO org.testcontainers.DockerClientFactory - ✔︎ Docker server version should be at least 1.6.0
[main] INFO 🐳 [trinodb/trino:latest] - Creating container for image: trinodb/trino:latest
[main] INFO 🐳 [trinodb/trino:latest] - Container trinodb/trino:latest is starting: 9a5c3abce4ca5728778fdc700fd585f158c1c8f1ad5f8b94648ee4ca57dd7d80
[main] INFO 🐳 [trinodb/trino:latest] - Waiting for database connection to become available at jdbc:trino://localhost:49864/ using query 'SELECT count(*) FROM tpch.tiny.nation'
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR: + launcher_opts=(--etc-dir /etc/trino)
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR: + grep -s -q node.id /etc/trino/node.properties
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR: + launcher_opts+=("-Dnode.id=${HOSTNAME}")
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR: + exec /usr/lib/trino/bin/launcher run --etc-dir /etc/trino -Dnode.id=9a5c3abce4ca
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDOUT: # WARNING: Unable to attach Serviceability Agent. You can try again with escalated privileges. Two options: a) use -Djol.tryWithSudo=true to try with sudo; b) echo 0 | sudo tee /proc/sys/kernel/yama/ptrace_scope
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR: Dec 16, 2022 6:56:14 PM io.airlift.log.Logger info
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR: INFO: Java version: 17.0.5
[...]
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR: 2022-12-16T18:56:26.113Z	INFO	main	io.airlift.bootstrap.LifeCycleManager	Life cycle starting...
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR: 2022-12-16T18:56:26.114Z	INFO	main	io.airlift.bootstrap.LifeCycleManager	Life cycle started
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR: 2022-12-16T18:56:26.115Z	INFO	main	io.trino.connector.StaticCatalogManager	-- Added catalog jmx using connector jmx --
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR: 2022-12-16T18:56:26.115Z	INFO	main	io.trino.connector.StaticCatalogManager	-- Loading catalog memory --
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR: 2022-12-16T18:56:26.186Z	INFO	dispatcher-query-3	io.trino.event.QueryMonitor	TIMELINE: Query 20221216_185626_00039_cuskt :: FAILED (SERVER_STARTING_UP) :: elapsed 0ms :: planning 0ms :: waiting 0ms :: scheduling 0ms :: running 0ms :: finishing 0ms :: begin 2022-12-16T18:56:26.186Z :: end 2022-12-16T18:56:26.186Z
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR: 2022-12-16T18:56:26.306Z	INFO	dispatcher-query-1	io.trino.event.QueryMonitor	TIMELINE: Query 20221216_185626_00040_cuskt :: FAILED (SERVER_STARTING_UP) :: elapsed 0ms :: planning 0ms :: waiting 0ms :: scheduling 0ms :: running 0ms :: finishing 0ms :: begin 2022-12-16T18:56:26.306Z :: end 2022-12-16T18:56:26.306Z
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR: 2022-12-16T18:56:26.382Z	INFO	main	Bootstrap	PROPERTY                              DEFAULT  RUNTIME  DESCRIPTION
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR: 2022-12-16T18:56:26.383Z	INFO	main	Bootstrap	memory.enable-lazy-dynamic-filtering  true     true
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR: 2022-12-16T18:56:26.383Z	INFO	main	Bootstrap	memory.max-data-per-node              128MB    128MB
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR: 2022-12-16T18:56:26.383Z	INFO	main	Bootstrap	memory.splits-per-node                2        2
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR: 2022-12-16T18:56:26.429Z	INFO	dispatcher-query-3	io.trino.event.QueryMonitor	TIMELINE: Query 20221216_185626_00041_cuskt :: FAILED (SERVER_STARTING_UP) :: elapsed 0ms :: planning 0ms :: waiting 0ms :: scheduling 0ms :: running 0ms :: finishing 0ms :: begin 2022-12-16T18:56:26.429Z :: end 2022-12-16T18:56:26.429Z
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR: 2022-12-16T18:56:26.482Z	INFO	main	io.airlift.bootstrap.LifeCycleManager	Life cycle starting...
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR: 2022-12-16T18:56:26.483Z	INFO	main	io.airlift.bootstrap.LifeCycleManager	Life cycle started
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR: 2022-12-16T18:56:26.483Z	INFO	main	io.trino.connector.StaticCatalogManager	-- Added catalog memory using connector memory --
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR: 2022-12-16T18:56:26.483Z	INFO	main	io.trino.connector.StaticCatalogManager	-- Loading catalog tpch --
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR: 2022-12-16T18:56:26.494Z	INFO	main	io.trino.connector.StaticCatalogManager	-- Added catalog tpch using connector tpch --
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR: 2022-12-16T18:56:26.498Z	INFO	main	io.trino.security.AccessControlManager	Using system access control: default
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR: 2022-12-16T18:56:26.531Z	INFO	main	io.trino.server.Server	======== SERVER STARTED ========
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR: 2022-12-16T18:56:27.274Z	ERROR	query-execution-6	io.trino.execution.scheduler.PipelinedQueryScheduler	Failure in distributed stage for query 20221216_185626_00042_cuskt
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR: io.trino.spi.TrinoException: No nodes available to run query
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR: 	at io.trino.execution.scheduler.UniformNodeSelector.computeAssignments(UniformNodeSelector.java:178)
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR: 	at io.trino.execution.scheduler.DynamicSplitPlacementPolicy.computeAssignments(DynamicSplitPlacementPolicy.java:41)
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR: 	at io.trino.execution.scheduler.SourcePartitionedScheduler.schedule(SourcePartitionedScheduler.java:292)
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR: 	at io.trino.execution.scheduler.SourcePartitionedScheduler$1.schedule(SourcePartitionedScheduler.java:172)
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR: 	at io.trino.execution.scheduler.PipelinedQueryScheduler$DistributedStagesScheduler.schedule(PipelinedQueryScheduler.java:1230)
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR: 	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR: 	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR: 	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR: 	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR: 	at java.base/java.lang.Thread.run(Thread.java:833)
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR: 
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR: 
[docker-java-stream--1977496177] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR: 2022-12-16T18:56:27.292Z	INFO	dispatcher-query-1	io.trino.event.QueryMonitor	TIMELINE: Query 20221216_185626_00042_cuskt :: FAILED (NO_NODES_AVAILABLE) :: elapsed 696ms :: planning 329ms :: waiting 273ms :: scheduling 367ms :: running 0ms :: finishing 367ms :: begin 2022-12-16T18:56:26.586Z :: end 2022-12-16T18:56:27.282Z
[main] INFO 🐳 [trinodb/trino:latest] - Container is started (JDBC URL: jdbc:trino://localhost:49864/)
[main] INFO 🐳 [trinodb/trino:latest] - Container trinodb/trino:latest started in PT14.723648S

Additional Information

starting the container suffices. Sometimes it does work out, so it is a timing issue.

hgschmie avatar Dec 16 '22 19:12 hgschmie

JFTR, it starts with version 374

eddumelendez avatar Dec 16 '22 22:12 eddumelendez

+1 on this issue

Gingernaut avatar Jan 04 '23 01:01 Gingernaut

At the time, I posted the previous comment tags from 374 to 405 were not working. However, taking a look at this again today seems that they are working fine 🤔. Logs doesn't contain the digest and during my tests didn't store them neither so not sure what happened here.

@hgschmie can you confirm this in your side, please? I've run 20 repeated tests and all green

eddumelendez avatar Jan 10 '23 19:01 eddumelendez

will try.

hgschmie avatar Jan 11 '23 18:01 hgschmie

still breaks for me:

[docker-java-stream-1217532399] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR: 2023-01-11T19:07:06.962Z      INFO    main    io.trino.server.Server  ======== SERVER STARTED ========
[docker-java-stream-1217532399] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR: 2023-01-11T19:07:07.995Z      ERROR   query-execution-6       io.trino.execution.scheduler.PipelinedQueryScheduler    Failure in distributed stage for query 20230111_190706_00058_7skfx
[docker-java-stream-1217532399] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR: io.trino.spi.TrinoException: No nodes available to run query
[docker-java-stream-1217532399] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR:       at io.trino.execution.scheduler.UniformNodeSelector.computeAssignments(UniformNodeSelector.java:216)
[docker-java-stream-1217532399] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR:       at io.trino.execution.scheduler.DynamicSplitPlacementPolicy.computeAssignments(DynamicSplitPlacementPolicy.java:41)
[docker-java-stream-1217532399] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR:       at io.trino.execution.scheduler.SourcePartitionedScheduler.schedule(SourcePartitionedScheduler.java:292)
[docker-java-stream-1217532399] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR:       at io.trino.execution.scheduler.SourcePartitionedScheduler$1.schedule(SourcePartitionedScheduler.java:172)
[docker-java-stream-1217532399] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR:       at io.trino.execution.scheduler.PipelinedQueryScheduler$DistributedStagesScheduler.schedule(PipelinedQueryScheduler.java:1235)
[docker-java-stream-1217532399] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR:       at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
[docker-java-stream-1217532399] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR:       at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
[docker-java-stream-1217532399] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR:       at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
[docker-java-stream-1217532399] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR:       at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
[docker-java-stream-1217532399] INFO org.jdbi.v3.testing.junit5.tc.TrinoJdbiTestContainersExtensionTest - STDERR:       at java.base/java.lang.Thread.run(Thread.java:833)

This seems to be an inherent trino problem. The server is ready but has not yet registered any processing nodes.

It is harder to reproduce in 405 (probably 10 runs until I got the error) but not gone.

hgschmie avatar Jan 11 '23 19:01 hgschmie

I'm hitting the same issue.

Created the test that fails pretty often on my machine (Intel Mac). Please, have a look here: https://github.com/testcontainers/testcontainers-java/pull/6563/files

I think I understand the issue. The query code we use for checking that container is ready is:

boolean testQuerySucceeded = statement.execute(this.getTestQueryString());
if (testQuerySucceeded) {
  logger().info("Container is started (JDBC URL: {})", this.getJdbcUrl());
  return;
}

Unfortunately, Trino enters some state when it returns from the execute method but it fails with No nodes available to run query when trying to fetch the data from the returned ResultSet. Thus, our check succeeds but Trino is not ready to serve queries that return data. In some other tests I successfully create tables and insert data, but fai later on querying it.

@findepi I see you contributed some code to TrinoContainer and know Trino well. What are your thoughts?

Solutions that come to my mind include:

  1. In TrinoContainer overriding the ready check code to include checking for real data retrieval.
  2. Changing it globally on JDBCContainer level.

WDYT?

szymonm avatar Feb 02 '23 09:02 szymonm

@findepi I see you contributed some code to TrinoContainer and know Trino well. What are your thoughts?

thanks @szymonm for the ping. yes, i think i contributed that class

the container defines the readiness check

    public String getTestQueryString() {
        return "SELECT count(*) FROM tpch.tiny.nation";

i think that when I was working on the container, this was sufficient. I think it's not very practical for Trino in a single-node setup to accept a query and fail with No nodes available to run query, so I would treat it as a Trino problem, that we may or may not choose to workaround within testcontainers.

@szymonm @hgschmie if you're able to reproduce the problem using trinodb/trino docker image directly (without testcontainers library), I would recommend filing the bug report with the Trino project.

findepi avatar Feb 02 '23 16:02 findepi