[CI] SearchTransportTelemetryTests testSearchTransportMetricsQueryThenFetch failing
CI Link
Only failed locally, sorry no build scan
Repro line
':server:test' --tests "org.elasticsearch.search.TelemetryMetrics.SearchTransportTelemetryTests.testSearchTransportMetricsQueryThenFetch" -Dtests.seed=A7DDD2883CF9F74D:5932D02B45F0F45A -Dtests.locale=is-IS -Dtests.timezone=Asia/Dacca -Druntime.java=21
Does it reproduce?
No
Applicable branches
main
Failure history
https://es-delivery-stats.elastic.dev/app/r/s/9tCmJ
Failure excerpt
No failures in CI it seems, but failed on a local run:
org.elasticsearch.search.TelemetryMetrics.SearchTransportTelemetryTests > testSearchTransportMetricsQueryThenFetch FAILED
java.lang.AssertionError: Values should be different. Actual: 0
at __randomizedtesting.SeedInfo.seed([A7DDD2883CF9F74D:5932D02B45F0F45A]:0)
at org.junit.Assert.fail(Assert.java:89)
at org.junit.Assert.failEquals(Assert.java:187)
at org.junit.Assert.assertNotEquals(Assert.java:201)
at org.junit.Assert.assertNotEquals(Assert.java:213)
at org.elasticsearch.search.TelemetryMetrics.SearchTransportTelemetryTests.testSearchTransportMetricsQueryThenFetch(SearchTransportTelemetryTests.java:81)
1> [2024-01-10T07:14:05,007][INFO ][o.e.s.T.SearchTransportTelemetryTests] [testSearchTransportMetricsQueryThenFetch] before test
1> [2024-01-10T07:14:05,013][INFO ][o.e.s.T.SearchTransportTelemetryTests] [testSearchTransportMetricsQueryThenFetch] [SearchTransportTelemetryTests#testSearchTransportMetricsQueryThenFetch]: setting up test
1> [2024-01-10T07:14:05,028][INFO ][o.e.t.InternalTestCluster] [testSearchTransportMetricsQueryThenFetch] Setup InternalTestCluster [TEST-TEST_WORKER_VM=[1142]-CLUSTER_SEED=[-2142127958903691377]-HASH=[B8C2190D1CAC]-cluster] with seed [E245A1F88A1CE78F] using [0] dedicated masters, [1] (data) nodes and [0] coord only nodes (master nodes are [auto-managed])
1> [2024-01-10T07:14:05,035][INFO ][o.e.n.Node ] [testSearchTransportMetricsQueryThenFetch] version[8.13.0-SNAPSHOT], pid[1723656], build[unknown/unknown/unknown], OS[Linux/6.2.0-1020-gcp/amd64], JVM[Oracle Corporation/OpenJDK 64-Bit Server VM/21.0.1/21.0.1+12-29]
1> [2024-01-10T07:14:05,035][INFO ][o.e.n.Node ] [testSearchTransportMetricsQueryThenFetch] JVM home [/home/davidturner/.gradle/jdks/oracle_corporation-21-amd64-linux/jdk-21.0.1], using bundled JDK [false]
1> [2024-01-10T07:14:05,035][INFO ][o.e.n.Node ] [testSearchTransportMetricsQueryThenFetch] JVM arguments [-Des.insecure_network_trace_enabled=true, -Des.scripting.update.ctx_in_params=false, -Des.search.rewrite_sort=true, -Des.transport.cname_in_publish_address=true, -Dgradle.dist.lib=/home/davidturner/.gradle/wrapper/dists/gradle-8.5-all/3zlzzgtsutfj0pbojr50n2l7z/gradle-8.5/lib, -Dgradle.user.home=/home/davidturner/.gradle, -Dgradle.worker.jar=/home/davidturner/.gradle/caches/8.5/workerMain/gradle-worker.jar, -Dio.netty.noKeySetOptimization=true, -Dio.netty.noUnsafe=true, -Dio.netty.recycler.maxCapacityPerThread=0, -Djava.awt.headless=true, -Djava.locale.providers=SPI,COMPAT, -Djava.security.manager=allow, -Djna.nosys=true, -Dorg.gradle.internal.worker.tmpdir=/home/davidturner/src/elasticsearch/server/build/tmp/test/work, -Dorg.gradle.native=false, -Dtests.artifact=server, -Dtests.gradle=true, -Dtests.gradle-loop-iteration=19, -Dtests.logger.level=WARN, -Dtests.security.manager=true, -Dtests.seed=A7DDD2883CF9F74D, -Dtests.task=:server:test, --add-opens=java.base/java.security.cert=ALL-UNNAMED, --add-opens=java.base/java.nio.channels=ALL-UNNAMED, --add-opens=java.base/java.net=ALL-UNNAMED, --add-opens=java.base/javax.net.ssl=ALL-UNNAMED, --add-opens=java.base/java.nio.file=ALL-UNNAMED, --add-opens=java.base/java.time=ALL-UNNAMED, --add-opens=java.management/java.lang.management=ALL-UNNAMED, -XX:+HeapDumpOnOutOfMemoryError, -esa, -XX:HeapDumpPath=/home/davidturner/src/elasticsearch/server/build/heapdump, -Xms512m, -Xmx512m, -Dfile.encoding=UTF-8, -Djava.io.tmpdir=/home/davidturner/src/elasticsearch/server/build/testrun/test/temp, -Duser.country, -Duser.language=en, -Duser.variant, -ea]
1> [2024-01-10T07:14:05,035][INFO ][o.e.n.Node ] [testSearchTransportMetricsQueryThenFetch] Default Locale [is_IS]
1> [2024-01-10T07:14:05,036][WARN ][o.e.n.Node ] [testSearchTransportMetricsQueryThenFetch] version [8.13.0-SNAPSHOT] is a pre-release version of Elasticsearch and is not suitable for production
1> [2024-01-10T07:14:05,036][WARN ][o.e.d.n.Node ] [testSearchTransportMetricsQueryThenFetch] data_stream.dataset="deprecation.elasticsearch" data_stream.namespace="default" data_stream.type="logs" elasticsearch.event.category="settings" event.code="shared-data-path" message="setting [path.shared_data] is deprecated and will be removed in a future release"
1> [2024-01-10T07:14:05,232][INFO ][o.e.e.NodeEnvironment ] [testSearchTransportMetricsQueryThenFetch] using [1] data paths, mounts [[/ (/dev/root)]], net usable_space [224.8gb], net total_space [242.2gb], types [ext4]
1> [2024-01-10T07:14:05,233][INFO ][o.e.e.NodeEnvironment ] [testSearchTransportMetricsQueryThenFetch] heap size [512mb], compressed ordinary object pointers [true]
1> [2024-01-10T07:14:05,236][INFO ][o.e.n.Node ] [testSearchTransportMetricsQueryThenFetch] node name [node_t0], node ID [9pmAQkDpT2-BAiFU_LPorA], cluster name [TEST-TEST_WORKER_VM=[1142]-CLUSTER_SEED=[-2142127958903691377]-HASH=[B8C2190D1CAC]-cluster], roles [master, data_warm, data_content, transform, data_hot, ml, data_frozen, ingest, data_cold, data, remote_cluster_client]
1> [2024-01-10T07:14:05,328][INFO ][o.e.f.FeatureService ] [testSearchTransportMetricsQueryThenFetch] Registered local node features [features_supported]
1> [2024-01-10T07:14:05,350][INFO ][o.e.i.r.RecoverySettings ] [testSearchTransportMetricsQueryThenFetch] using rate limit [40mb] with [default=40mb, read=0b, write=0b, max=0b]
1> [2024-01-10T07:14:05,352][INFO ][o.e.d.DiscoveryModule ] [testSearchTransportMetricsQueryThenFetch] using discovery type [multi-node] and seed hosts providers [settings, file]
1> [2024-01-10T07:14:05,569][INFO ][o.e.n.Node ] [testSearchTransportMetricsQueryThenFetch] initialized
1> [2024-01-10T07:14:05,596][INFO ][o.e.n.Node ] [[test_TEST-TEST_WORKER_VM=[1142]-CLUSTER_SEED=[-2142127958903691377]-HASH=[B8C2190D1CAC]-cluster[T#1]]] starting ...
1> [2024-01-10T07:14:05,676][INFO ][o.e.t.TransportService ] [[test_TEST-TEST_WORKER_VM=[1142]-CLUSTER_SEED=[-2142127958903691377]-HASH=[B8C2190D1CAC]-cluster[T#1]]] publish_address {127.0.0.1:28181}, bound_addresses {[::1]:28181}, {127.0.0.1:28181}
1> [2024-01-10T07:14:05,724][INFO ][o.e.c.c.ClusterBootstrapService] [[test_TEST-TEST_WORKER_VM=[1142]-CLUSTER_SEED=[-2142127958903691377]-HASH=[B8C2190D1CAC]-cluster[T#1]]] this node has not joined a bootstrapped cluster yet; [cluster.initial_master_nodes] is set to [node_t0]
1> [2024-01-10T07:14:05,747][INFO ][o.e.c.c.Coordinator ] [[test_TEST-TEST_WORKER_VM=[1142]-CLUSTER_SEED=[-2142127958903691377]-HASH=[B8C2190D1CAC]-cluster[T#1]]] setting initial configuration to VotingConfiguration{9pmAQkDpT2-BAiFU_LPorA}
1> [2024-01-10T07:14:05,845][INFO ][o.e.c.s.MasterService ] [node_t0] elected-as-master ([1] nodes joined in term 1)[_FINISH_ELECTION_, {node_t0}{9pmAQkDpT2-BAiFU_LPorA}{a2O6zoWDQFiQrNroIsfMsQ}{node_t0}{127.0.0.1}{127.0.0.1:28181}{cdfhilmrstw}{8.13.0}{7000099-8501000} completing election], term: 1, version: 1, delta: master node changed {previous [], current [{node_t0}{9pmAQkDpT2-BAiFU_LPorA}{a2O6zoWDQFiQrNroIsfMsQ}{node_t0}{127.0.0.1}{127.0.0.1:28181}{cdfhilmrstw}{8.13.0}{7000099-8501000}]}
1> [2024-01-10T07:14:05,880][INFO ][o.e.c.c.CoordinationState] [node_t0] cluster UUID set to [I8Rrx0LzSuqGvOAS0lB8Jg]
1> [2024-01-10T07:14:05,900][INFO ][o.e.c.s.ClusterApplierService] [node_t0] master node changed {previous [], current [{node_t0}{9pmAQkDpT2-BAiFU_LPorA}{a2O6zoWDQFiQrNroIsfMsQ}{node_t0}{127.0.0.1}{127.0.0.1:28181}{cdfhilmrstw}{8.13.0}{7000099-8501000}]}, term: 1, version: 1, reason: Publication{term=1, version=1}
1> [2024-01-10T07:14:05,930][INFO ][o.e.c.f.AbstractFileWatchingService] [node_t0] starting file watcher ...
1> [2024-01-10T07:14:05,970][INFO ][o.e.c.c.NodeJoinExecutor ] [node_t0] node-join: [{node_t0}{9pmAQkDpT2-BAiFU_LPorA}{a2O6zoWDQFiQrNroIsfMsQ}{node_t0}{127.0.0.1}{127.0.0.1:28181}{cdfhilmrstw}{8.13.0}{7000099-8501000}] with reason [completing election]
1> [2024-01-10T07:14:05,972][INFO ][o.e.c.f.AbstractFileWatchingService] [[elasticsearch[file-watcher[/home/davidturner/src/elasticsearch/server/build/testrun/test/temp/org.elasticsearch.search.TelemetryMetrics.SearchTransportTelemetryTests_A7DDD2883CF9F74D-001/tempDir-002/config/operator/settings.json]]]] file settings service up and running [tid=1396]
1> [2024-01-10T07:14:05,998][INFO ][o.e.n.Node ] [[test_TEST-TEST_WORKER_VM=[1142]-CLUSTER_SEED=[-2142127958903691377]-HASH=[B8C2190D1CAC]-cluster[T#1]]] started {node_t0}{9pmAQkDpT2-BAiFU_LPorA}{a2O6zoWDQFiQrNroIsfMsQ}{node_t0}{127.0.0.1}{127.0.0.1:28181}{cdfhilmrstw}{8.13.0}{7000099-8501000}
1> [2024-01-10T07:14:06,024][INFO ][o.e.g.GatewayService ] [node_t0] recovered [0] indices into cluster_state
1> [2024-01-10T07:14:06,040][INFO ][o.e.h.n.s.HealthNodeTaskExecutor] [node_t0] Node [{node_t0}{9pmAQkDpT2-BAiFU_LPorA}] is selected as the current health node.
1> [2024-01-10T07:14:06,159][INFO ][o.e.c.m.MetadataIndexTemplateService] [node_t0] adding template [random_index_template] for index patterns [*]
1> [2024-01-10T07:14:06,181][INFO ][o.e.s.T.SearchTransportTelemetryTests] [testSearchTransportMetricsQueryThenFetch] [SearchTransportTelemetryTests#testSearchTransportMetricsQueryThenFetch]: all set up test
1> [2024-01-10T07:14:06,188][INFO ][o.e.c.m.MetadataCreateIndexService] [node_t0] [test2] creating index, cause [api], templates [random_index_template], shards [3]/[0]
1> [2024-01-10T07:14:06,478][INFO ][o.e.c.r.a.AllocationService] [node_t0] current.health="GREEN" message="Cluster health status changed from [YELLOW] to [GREEN] (reason: [shards started [[test2][0]]])." previous.health="YELLOW" reason="shards started [[test2][0]]"
1> [2024-01-10T07:14:06,502][INFO ][o.e.s.T.SearchTransportTelemetryTests] [testSearchTransportMetricsQueryThenFetch] Index [1] docs async: [true] bulk: [false]
1> [2024-01-10T07:14:06,613][INFO ][o.e.c.m.MetadataMappingService] [node_t0] [test2/LloPpu4jSU6HaBIMKMpbyA] create_mapping
1> [2024-01-10T07:14:06,994][INFO ][o.e.s.T.SearchTransportTelemetryTests] [testSearchTransportMetricsQueryThenFetch] [SearchTransportTelemetryTests#testSearchTransportMetricsQueryThenFetch]: cleaning up after test
1> [2024-01-10T07:14:07,173][INFO ][o.e.c.m.MetadataDeleteIndexService] [node_t0] [test2/LloPpu4jSU6HaBIMKMpbyA] deleting index
1> [2024-01-10T07:14:07,203][INFO ][o.e.t.s.M.Listener ] [node_t0] [test2][1] start check index
1> [2024-01-10T07:14:07,214][INFO ][o.e.t.s.M.Listener ] [node_t0] [test2][1] end check index
1> [2024-01-10T07:14:07,221][INFO ][o.e.t.s.M.Listener ] [node_t0] [test2][0] start check index
1> [2024-01-10T07:14:07,223][INFO ][o.e.t.s.M.Listener ] [node_t0] [test2][0] end check index
1> [2024-01-10T07:14:07,227][INFO ][o.e.t.s.M.Listener ] [node_t0] [test2][2] start check index
1> [2024-01-10T07:14:07,230][INFO ][o.e.t.s.M.Listener ] [node_t0] [test2][2] end check index
1> [2024-01-10T07:14:07,246][INFO ][o.e.c.m.MetadataIndexTemplateService] [node_t0] removing template [random_index_template]
1> [2024-01-10T07:14:07,261][INFO ][o.e.n.Node ] [testSearchTransportMetricsQueryThenFetch] stopping ...
1> [2024-01-10T07:14:07,264][INFO ][o.e.c.f.AbstractFileWatchingService] [[elasticsearch[file-watcher[/home/davidturner/src/elasticsearch/server/build/testrun/test/temp/org.elasticsearch.search.TelemetryMetrics.SearchTransportTelemetryTests_A7DDD2883CF9F74D-001/tempDir-002/config/operator/settings.json]]]] shutting down watcher thread
1> [2024-01-10T07:14:07,270][INFO ][o.e.c.f.AbstractFileWatchingService] [testSearchTransportMetricsQueryThenFetch] watcher service stopped
1> [2024-01-10T07:14:07,282][INFO ][o.e.n.Node ] [testSearchTransportMetricsQueryThenFetch] stopped
1> [2024-01-10T07:14:07,282][INFO ][o.e.n.Node ] [testSearchTransportMetricsQueryThenFetch] closing ...
1> [2024-01-10T07:14:07,286][INFO ][o.e.n.Node ] [testSearchTransportMetricsQueryThenFetch] closed
1> [2024-01-10T07:14:07,291][INFO ][o.e.s.T.SearchTransportTelemetryTests] [testSearchTransportMetricsQueryThenFetch] [SearchTransportTelemetryTests#testSearchTransportMetricsQueryThenFetch]: cleaned up after test
1> [2024-01-10T07:14:07,292][INFO ][o.e.s.T.SearchTransportTelemetryTests] [testSearchTransportMetricsQueryThenFetch] after test
Pinging @elastic/es-search (Team:Search)
@JVerwolf looks like most of SearchTransportTelemetryTests are flaky and muted. Mind determining impact and digging in?
Here's a build scan of this failing on main: https://gradle-enterprise.elastic.co/s/c2pzftpx5sxz2/tests/:server:test/org.elasticsearch.search.TelemetryMetrics.SearchTransportTelemetryTests/testSearchTransportMetricsQueryThenFetch.
Another instance on a PR, it seems the original test here was not muted so I've done so in ad47ffbb2d5
Pinging @elastic/es-search-foundations (Team:Search Foundations)
@JVerwolf , please provide a risk assessment on this muted test.
Labeling this as low risk due to it targeting transport telemetry
Agreed, @andreidan. I spent a number of cycles trying to reproduce, but there must be some logic that causes various transport actions to be called differently depending on the random cluster topology (e.g. via an optimization).
IIR, @carlosdelest was adding telemetry that will replace this. When that happens, these can be deleted.
IIR, @carlosdelest was adding telemetry that will replace this. When that happens, these can be deleted.
@JVerwolf correct, these are removed as part of https://github.com/elastic/elasticsearch/pull/113194
Closed by https://github.com/elastic/elasticsearch/pull/113194