hazelcast-jet icon indicating copy to clipboard operation
hazelcast-jet copied to clipboard

com.hazelcast.jet.pipeline.SourceBuilder_TopologyChangeTest.test_restartJob_nodeShutDown

Open olukas opened this issue 3 years ago • 1 comments

master (commit 4ec1e18dc5b29e5c6d6d316a62dbb72572a8e24b)

Failed on Zulu JDK 11: http://jenkins.hazelcast.com/job/jet-oss-master-zulu-jdk11/289/testReport/com.hazelcast.jet.pipeline/SourceBuilder_TopologyChangeTest/test_restartJob_nodeShutDown/

Stacktrace:

java.lang.AssertionError: No snapshot produced
	at org.junit.Assert.fail(Assert.java:89)
	at org.junit.Assert.assertTrue(Assert.java:42)
	at com.hazelcast.jet.core.JetTestSupport.lambda$waitForFirstSnapshot$3(JetTestSupport.java:257)
	at com.hazelcast.test.HazelcastTestSupport.assertTrueEventually(HazelcastTestSupport.java:1253)
	at com.hazelcast.test.HazelcastTestSupport.assertTrueEventually(HazelcastTestSupport.java:1270)
	at com.hazelcast.jet.core.JetTestSupport.waitForFirstSnapshot(JetTestSupport.java:254)
	at com.hazelcast.jet.pipeline.SourceBuilder_TopologyChangeTest.testTopologyChange(SourceBuilder_TopologyChangeTest.java:114)
	at com.hazelcast.jet.pipeline.SourceBuilder_TopologyChangeTest.test_restartJob_nodeShutDown(SourceBuilder_TopologyChangeTest.java:53)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:566)
	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
	at com.hazelcast.test.FailOnTimeoutStatement$CallableStatement.call(FailOnTimeoutStatement.java:114)
	at com.hazelcast.test.FailOnTimeoutStatement$CallableStatement.call(FailOnTimeoutStatement.java:106)
	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
	at java.base/java.lang.Thread.run(Thread.java:834)

Standard output:

d.thread-57] [c.h.j.i.JobCoordinationService]: [127.0.0.1]:5701 [jet] [4.4-SNAPSHOT] Not starting jobs because master is not in safe state.
2020-11-06 03:24:02,237 [DEBUG] [hz.silly_cannon.cached.thread-51] [c.h.j.i.JobRepository]: [127.0.0.1]:5701 [jet] [4.4-SNAPSHOT] Cleared snapshot data map __jet.snapshot.053b-2414-3200-0001.0
2020-11-06 03:24:02,238 [ERROR] [hz.silly_cannon.cached.thread-51] [c.h.j.i.o.SnapshotPhase2Operation]: [127.0.0.1]:5701 [jet] [4.4-SNAPSHOT] job 053b-2414-3200-0001, execution 053b-2414-3201-0001 not found for coordinator [127.0.0.1]:5701 for 'SnapshotPhase2Operation'
com.hazelcast.jet.core.TopologyChangedException: job 053b-2414-3200-0001, execution 053b-2414-3201-0001 not found for coordinator [127.0.0.1]:5701 for 'SnapshotPhase2Operation'
	at com.hazelcast.jet.impl.JobExecutionService.assertExecutionContext(JobExecutionService.java:326) ~[classes/:?]
	at com.hazelcast.jet.impl.operation.SnapshotPhase2Operation.doRun(SnapshotPhase2Operation.java:50) ~[classes/:?]
	at com.hazelcast.jet.impl.operation.AsyncOperation.run(AsyncOperation.java:53) ~[classes/:?]
	at com.hazelcast.spi.impl.operationservice.Operation.call(Operation.java:184) ~[hazelcast-4.1.1-SNAPSHOT.jar:4.1.1-SNAPSHOT]
	at com.hazelcast.spi.impl.operationservice.impl.OperationRunnerImpl.call(OperationRunnerImpl.java:256) ~[hazelcast-4.1.1-SNAPSHOT.jar:4.1.1-SNAPSHOT]
	at com.hazelcast.spi.impl.operationservice.impl.OperationRunnerImpl.run(OperationRunnerImpl.java:237) ~[hazelcast-4.1.1-SNAPSHOT.jar:4.1.1-SNAPSHOT]
	at com.hazelcast.spi.impl.operationservice.impl.OperationRunnerImpl.run(OperationRunnerImpl.java:213) ~[hazelcast-4.1.1-SNAPSHOT.jar:4.1.1-SNAPSHOT]
	at com.hazelcast.spi.impl.operationexecutor.impl.OperationExecutorImpl.run(OperationExecutorImpl.java:411) ~[hazelcast-4.1.1-SNAPSHOT.jar:4.1.1-SNAPSHOT]
	at com.hazelcast.spi.impl.operationexecutor.impl.OperationExecutorImpl.runOrExecute(OperationExecutorImpl.java:438) ~[hazelcast-4.1.1-SNAPSHOT.jar:4.1.1-SNAPSHOT]
	at com.hazelcast.spi.impl.operationservice.impl.Invocation.doInvokeLocal(Invocation.java:603) ~[hazelcast-4.1.1-SNAPSHOT.jar:4.1.1-SNAPSHOT]
	at com.hazelcast.spi.impl.operationservice.impl.Invocation.doInvoke(Invocation.java:588) ~[hazelcast-4.1.1-SNAPSHOT.jar:4.1.1-SNAPSHOT]
	at com.hazelcast.spi.impl.operationservice.impl.Invocation.invoke0(Invocation.java:547) ~[hazelcast-4.1.1-SNAPSHOT.jar:4.1.1-SNAPSHOT]
	at com.hazelcast.spi.impl.operationservice.impl.Invocation.invoke(Invocation.java:244) ~[hazelcast-4.1.1-SNAPSHOT.jar:4.1.1-SNAPSHOT]
	at com.hazelcast.spi.impl.operationservice.impl.InvocationBuilderImpl.invoke(InvocationBuilderImpl.java:59) ~[hazelcast-4.1.1-SNAPSHOT.jar:4.1.1-SNAPSHOT]
	at com.hazelcast.jet.impl.MasterContext.invokeOnParticipant(MasterContext.java:261) ~[classes/:?]
	at com.hazelcast.jet.impl.MasterContext.invokeOnParticipants(MasterContext.java:245) ~[classes/:?]
	at com.hazelcast.jet.impl.MasterSnapshotContext.lambda$onSnapshotPhase1Complete$5(MasterSnapshotContext.java:269) ~[classes/:?]
	at com.hazelcast.jet.impl.JobCoordinationService.lambda$submitToCoordinatorThread$46(JobCoordinationService.java:1028) ~[classes/:?]
	at com.hazelcast.jet.impl.JobCoordinationService.lambda$submitToCoordinatorThread$47(JobCoordinationService.java:1049) ~[classes/:?]
	at com.hazelcast.internal.util.executor.CompletableFutureTask.run(CompletableFutureTask.java:64) [hazelcast-4.1.1-SNAPSHOT.jar:4.1.1-SNAPSHOT]
	at com.hazelcast.internal.util.executor.CachedExecutorServiceDelegate$Worker.run(CachedExecutorServiceDelegate.java:217) [hazelcast-4.1.1-SNAPSHOT.jar:4.1.1-SNAPSHOT]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
	at java.lang.Thread.run(Thread.java:834) [?:?]
	at com.hazelcast.internal.util.executor.HazelcastManagedThread.executeRun(HazelcastManagedThread.java:76) [hazelcast-4.1.1-SNAPSHOT.jar:4.1.1-SNAPSHOT]
	at com.hazelcast.internal.util.executor.HazelcastManagedThread.run(HazelcastManagedThread.java:102) [hazelcast-4.1.1-SNAPSHOT.jar:4.1.1-SNAPSHOT]
2020-11-06 03:24:02,238 [ WARN] [hz.silly_cannon.cached.thread-51] [c.h.j.i.MasterSnapshotContext]: [127.0.0.1]:5701 [jet] [4.4-SNAPSHOT] SnapshotPhase2Operation for snapshot 1 in job '053b-2414-3200-0001', execution 053b-2414-3201-0001 failed on member: MemberInfo{address=[127.0.0.1]:5701, uuid=f532e641-eea6-44c0-896a-1b3bf5244f37, liteMember=false, memberListJoinVersion=1}=com.hazelcast.jet.core.TopologyChangedException: job 053b-2414-3200-0001, execution 053b-2414-3201-0001 not found for coordinator [127.0.0.1]:5701 for 'SnapshotPhase2Operation'
com.hazelcast.jet.core.TopologyChangedException: job 053b-2414-3200-0001, execution 053b-2414-3201-0001 not found for coordinator [127.0.0.1]:5701 for 'SnapshotPhase2Operation'
	at com.hazelcast.jet.impl.JobExecutionService.assertExecutionContext(JobExecutionService.java:326) ~[classes/:?]
	at com.hazelcast.jet.impl.operation.SnapshotPhase2Operation.doRun(SnapshotPhase2Operation.java:50) ~[classes/:?]
	at com.hazelcast.jet.impl.operation.AsyncOperation.run(AsyncOperation.java:53) ~[classes/:?]
	at com.hazelcast.spi.impl.operationservice.Operation.call(Operation.java:184) ~[hazelcast-4.1.1-SNAPSHOT.jar:4.1.1-SNAPSHOT]
	at com.hazelcast.spi.impl.operationservice.impl.OperationRunnerImpl.call(OperationRunnerImpl.java:256) ~[hazelcast-4.1.1-SNAPSHOT.jar:4.1.1-SNAPSHOT]
	at com.hazelcast.spi.impl.operationservice.impl.OperationRunnerImpl.run(OperationRunnerImpl.java:237) ~[hazelcast-4.1.1-SNAPSHOT.jar:4.1.1-SNAPSHOT]
	at com.hazelcast.spi.impl.operationservice.impl.OperationRunnerImpl.run(OperationRunnerImpl.java:213) ~[hazelcast-4.1.1-SNAPSHOT.jar:4.1.1-SNAPSHOT]
	at com.hazelcast.spi.impl.operationexecutor.impl.OperationExecutorImpl.run(OperationExecutorImpl.java:411) ~[hazelcast-4.1.1-SNAPSHOT.jar:4.1.1-SNAPSHOT]
	at com.hazelcast.spi.impl.operationexecutor.impl.OperationExecutorImpl.runOrExecute(OperationExecutorImpl.java:438) ~[hazelcast-4.1.1-SNAPSHOT.jar:4.1.1-SNAPSHOT]
	at com.hazelcast.spi.impl.operationservice.impl.Invocation.doInvokeLocal(Invocation.java:603) ~[hazelcast-4.1.1-SNAPSHOT.jar:4.1.1-SNAPSHOT]
	at com.hazelcast.spi.impl.operationservice.impl.Invocation.doInvoke(Invocation.java:588) ~[hazelcast-4.1.1-SNAPSHOT.jar:4.1.1-SNAPSHOT]
	at com.hazelcast.spi.impl.operationservice.impl.Invocation.invoke0(Invocation.java:547) ~[hazelcast-4.1.1-SNAPSHOT.jar:4.1.1-SNAPSHOT]
	at com.hazelcast.spi.impl.operationservice.impl.Invocation.invoke(Invocation.java:244) ~[hazelcast-4.1.1-SNAPSHOT.jar:4.1.1-SNAPSHOT]
	at com.hazelcast.spi.impl.operationservice.impl.InvocationBuilderImpl.invoke(InvocationBuilderImpl.java:59) ~[hazelcast-4.1.1-SNAPSHOT.jar:4.1.1-SNAPSHOT]
	at com.hazelcast.jet.impl.MasterContext.invokeOnParticipant(MasterContext.java:261) ~[classes/:?]
	at com.hazelcast.jet.impl.MasterContext.invokeOnParticipants(MasterContext.java:245) ~[classes/:?]
	at com.hazelcast.jet.impl.MasterSnapshotContext.lambda$onSnapshotPhase1Complete$5(MasterSnapshotContext.java:269) ~[classes/:?]
	at com.hazelcast.jet.impl.JobCoordinationService.lambda$submitToCoordinatorThread$46(JobCoordinationService.java:1028) ~[classes/:?]
	at com.hazelcast.jet.impl.JobCoordinationService.lambda$submitToCoordinatorThread$47(JobCoordinationService.java:1049) ~[classes/:?]
	at com.hazelcast.internal.util.executor.CompletableFutureTask.run(CompletableFutureTask.java:64) [hazelcast-4.1.1-SNAPSHOT.jar:4.1.1-SNAPSHOT]
	at com.hazelcast.internal.util.executor.CachedExecutorServiceDelegate$Worker.run(CachedExecutorServiceDelegate.java:217) [hazelcast-4.1.1-SNAPSHOT.jar:4.1.1-SNAPSHOT]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
	at java.lang.Thread.run(Thread.java:834) [?:?]
	at com.hazelcast.internal.util.executor.HazelcastManagedThread.executeRun(HazelcastManagedThread.java:76) [hazelcast-4.1.1-SNAPSHOT.jar:4.1.1-SNAPSHOT]
	at com.hazelcast.internal.util.executor.HazelcastManagedThread.run(HazelcastManagedThread.java:102) [hazelcast-4.1.1-SNAPSHOT.jar:4.1.1-SNAPSHOT]
2020-11-06 03:24:02,238 [ WARN] [hz.silly_cannon.cached.thread-51] [c.h.j.i.MasterSnapshotContext]: [127.0.0.1]:5701 [jet] [4.4-SNAPSHOT] SnapshotPhase2Operation for snapshot 1 in job '053b-2414-3200-0001', execution 053b-2414-3201-0001 failed on member: MemberInfo{address=[127.0.0.1]:5702, uuid=54517725-aadf-4ea1-a655-dd89f54702f8, liteMember=false, memberListJoinVersion=2}=com.hazelcast.spi.exception.TargetNotMemberException: Not Member! target: [127.0.0.1]:5702, partitionId: -1, operation: com.hazelcast.jet.impl.operation.SnapshotPhase2Operation, service: hz:impl:jetService
com.hazelcast.spi.exception.TargetNotMemberException: Not Member! target: [127.0.0.1]:5702, partitionId: -1, operation: com.hazelcast.jet.impl.operation.SnapshotPhase2Operation, service: hz:impl:jetService
	at com.hazelcast.spi.impl.operationservice.impl.Invocation.initInvocationTarget(Invocation.java:295) ~[hazelcast-4.1.1-SNAPSHOT.jar:4.1.1-SNAPSHOT]
	at com.hazelcast.spi.impl.operationservice.impl.Invocation.doInvoke(Invocation.java:569) ~[hazelcast-4.1.1-SNAPSHOT.jar:4.1.1-SNAPSHOT]
	at com.hazelcast.spi.impl.operationservice.impl.Invocation.invoke0(Invocation.java:547) ~[hazelcast-4.1.1-SNAPSHOT.jar:4.1.1-SNAPSHOT]
	at com.hazelcast.spi.impl.operationservice.impl.Invocation.invoke(Invocation.java:244) ~[hazelcast-4.1.1-SNAPSHOT.jar:4.1.1-SNAPSHOT]
	at com.hazelcast.spi.impl.operationservice.impl.InvocationBuilderImpl.invoke(InvocationBuilderImpl.java:59) ~[hazelcast-4.1.1-SNAPSHOT.jar:4.1.1-SNAPSHOT]
	at com.hazelcast.jet.impl.MasterContext.invokeOnParticipant(MasterContext.java:261) ~[classes/:?]
	at com.hazelcast.jet.impl.MasterContext.invokeOnParticipants(MasterContext.java:245) ~[classes/:?]
	at com.hazelcast.jet.impl.MasterSnapshotContext.lambda$onSnapshotPhase1Complete$5(MasterSnapshotContext.java:269) ~[classes/:?]
	at com.hazelcast.jet.impl.JobCoordinationService.lambda$submitToCoordinatorThread$46(JobCoordinationService.java:1028) ~[classes/:?]
	at com.hazelcast.jet.impl.JobCoordinationService.lambda$submitToCoordinatorThread$47(JobCoordinationService.java:1049) ~[classes/:?]
	at com.hazelcast.internal.util.executor.CompletableFutureTask.run(CompletableFutureTask.java:64) [hazelcast-4.1.1-SNAPSHOT.jar:4.1.1-SNAPSHOT]
	at com.hazelcast.internal.util.executor.CachedExecutorServiceDelegate$Worker.run(CachedExecutorServiceDelegate.java:217) [hazelcast-4.1.1-SNAPSHOT.jar:4.1.1-SNAPSHOT]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
	at java.lang.Thread.run(Thread.java:834) [?:?]
	at com.hazelcast.internal.util.executor.HazelcastManagedThread.executeRun(HazelcastManagedThread.java:76) [hazelcast-4.1.1-SNAPSHOT.jar:4.1.1-SNAPSHOT]
	at com.hazelcast.internal.util.executor.HazelcastManagedThread.run(HazelcastManagedThread.java:102) [hazelcast-4.1.1-SNAPSHOT.jar:4.1.1-SNAPSHOT]
2020-11-06 03:24:02,239 [DEBUG] [hz.silly_cannon.cached.thread-51] [c.h.j.i.JobCoordinationService]: [127.0.0.1]:5701 [jet] [4.4-SNAPSHOT] job '053b-2414-3200-0001', execution 053b-2414-3201-0001 snapshot is scheduled in 500ms
2020-11-06 03:24:02,239 [DEBUG] [hz.silly_cannon.cached.thread-51] [c.h.j.i.MasterSnapshotContext]: [127.0.0.1]:5701 [jet] [4.4-SNAPSHOT] Snapshot 1 for job '053b-2414-3200-0001', execution 053b-2414-3201-0001 completed in 101ms, status=failure: com.hazelcast.jet.core.TopologyChangedException: job 053b-2414-3200-0001, execution 053b-2414-3201-0001 not found for coordinator [127.0.0.1]:5701 for 'SnapshotPhase1Operation'
2020-11-06 03:24:02,239 [DEBUG] [hz.silly_cannon.cached.thread-51] [c.h.j.i.MasterSnapshotContext]: [127.0.0.1]:5701 [jet] [4.4-SNAPSHOT] Not beginning snapshot, job '053b-2414-3200-0001', execution 053b-2414-3201-0001 is not RUNNING, but NOT_RUNNING
2020-11-06 03:24:02,244 [ WARN] [hz.silly_cannon.migration] [c.h.i.p.i.MigrationPlanner]: Logging took 512 ms.
2020-11-06 03:24:02,245 [ INFO] [Thread-478] [c.h.i.i.NodeExtension]: [127.0.0.1]:5702 [jet] [4.4-SNAPSHOT] Destroying node NodeExtension.
2020-11-06 03:24:02,245 [ INFO] [Thread-478] [c.h.i.i.Node]: [127.0.0.1]:5702 [jet] [4.4-SNAPSHOT] Hazelcast Shutdown is completed in 1802 ms.
2020-11-06 03:24:02,246 [ INFO] [Thread-478] [c.h.c.LifecycleService]: [127.0.0.1]:5702 [jet] [4.4-SNAPSHOT] [127.0.0.1]:5702 is SHUTDOWN
2020-11-06 03:24:02,246 [ INFO] [Thread-478] [c.h.c.LifecycleService]: [127.0.0.1]:5701 [jet] [4.4-SNAPSHOT] [127.0.0.1]:5701 is SHUTTING_DOWN
2020-11-06 03:24:02,246 [ WARN] [Thread-478] [c.h.i.i.Node]: [127.0.0.1]:5701 [jet] [4.4-SNAPSHOT] Terminating forcefully...
2020-11-06 03:24:02,246 [ INFO] [Thread-478] [c.h.i.i.Node]: [127.0.0.1]:5701 [jet] [4.4-SNAPSHOT] Shutting down connection manager...
2020-11-06 03:24:02,259 [ INFO] [hz.silly_cannon.migration] [c.h.i.p.i.MigrationManager]: [127.0.0.1]:5701 [jet] [4.4-SNAPSHOT] Partition balance is ok, no need to repartition.
2020-11-06 03:24:02,313 [ INFO] [Thread-478] [c.h.i.i.Node]: [127.0.0.1]:5701 [jet] [4.4-SNAPSHOT] Shutting down node engine...
2020-11-06 03:24:02,418 [ INFO] [Thread-478] [c.h.i.i.NodeExtension]: [127.0.0.1]:5701 [jet] [4.4-SNAPSHOT] Destroying node NodeExtension.
2020-11-06 03:24:02,418 [ INFO] [Thread-478] [c.h.i.i.Node]: [127.0.0.1]:5701 [jet] [4.4-SNAPSHOT] Hazelcast Shutdown is completed in 172 ms.
2020-11-06 03:24:02,419 [ INFO] [Thread-478] [c.h.c.LifecycleService]: [127.0.0.1]:5701 [jet] [4.4-SNAPSHOT] [127.0.0.1]:5701 is SHUTDOWN
BuildInfo right after test_restartJob_nodeShutDown(com.hazelcast.jet.pipeline.SourceBuilder_TopologyChangeTest): BuildInfo{version='4.1.1-SNAPSHOT', build='20201105', buildNumber=20201105, revision=51b6a4b, enterprise=false, serializationVersion=1, jet=JetBuildInfo{version='4.4-SNAPSHOT', build='20201106', revision='4ec1e18'}}
Hiccups measured while running test 'test_restartJob_nodeShutDown(com.hazelcast.jet.pipeline.SourceBuilder_TopologyChangeTest):'
03:23:40, accumulated pauses: 1280 ms, max pause: 264 ms, pauses over 1000 ms: 0
03:23:45, accumulated pauses: 1742 ms, max pause: 729 ms, pauses over 1000 ms: 0
03:23:50, accumulated pauses: 850 ms, max pause: 18 ms, pauses over 1000 ms: 0
03:23:55, accumulated pauses: 1786 ms, max pause: 838 ms, pauses over 1000 ms: 0
03:24:00, accumulated pauses: 994 ms, max pause: 503 ms, pauses over 1000 ms: 0

olukas avatar Nov 06 '20 07:11 olukas

The test failed on the very first assertion without even restarting any nodes. after starting the job, test waits for the first results to arrive to the list and then checks if any snapshots produced. From the logs we can say that snapshot started

SNAPSHOT] Starting snapshot 0 phase 1 for job '053b-2414-3200-0001', execution 053b-2414-3201-0001 on member
Will save 321 to snapshot
2020-11-06 03:23:48,749 [DEBUG] [hz.silly_cannon.jet.cooperative.thread-0] [c.h.j.i.u.AsyncSnapshotWriterImpl]: [127.0.0.1]:5701 [jet] [4.4-SNAPSHOT] Stats for src: keys=1, chunks=1, bytes=170
2020-11-06 03:23:48,754 [DEBUG] [hz.silly_cannon.jet.cooperative.thread-1] [c.h.j.i.u.AsyncSnapshotWriterImpl]: [127.0.0.1]:5701 [jet] [4.4-SNAPSHOT] Stats for sliding-window-prepare: keys=0, chunks=0, bytes=0

The snapshot interval is 500ms but there is no other snapshot related logs for 10 seconds (the timeout for waitForFirstSnapshot call) and then test fails. Looks like the first snapshot (0) stuck at pahse1.

The pipeline has peek just before sink so we should have lots of Output to ordinal 0 in the logs but we see only 2 lines of logs for the timeout duration (10 seconds).

I'm putting the relevant logs below.

2020-11-06 03:23:46,869 [ INFO] [hz.silly_cannon.cached.thread-13] [c.h.j.i.JobCoordinationService]: [127.0.0.1]:5701 [jet] [4.4-SNAPSHOT] Starting job 053b-2414-3200-0001 based on submit request
2020-11-06 03:23:46,903 [ INFO] [hz.silly_cannon.cached.thread-13] [c.h.j.i.MasterJobContext]: [127.0.0.1]:5701 [jet] [4.4-SNAPSHOT] Didn't find any snapshot to restore for job '053b-2414-3200-0001', execution 053b-2414-3201-0001
2020-11-06 03:23:46,903 [ INFO] [hz.silly_cannon.cached.thread-13] [c.h.j.i.MasterJobContext]: [127.0.0.1]:5701 [jet] [4.4-SNAPSHOT] Start executing job '053b-2414-3200-0001', execution 053b-2414-3201-0001, execution graph in DOT format:
digraph DAG {
	"src" [localParallelism=1];
	"sliding-window-prepare" [localParallelism=72];
	"sliding-window" [localParallelism=1];
	"listSink(result-1acb9824-1220-42b4-bbee-1c6755d0581c)" [localParallelism=1];
	"src" -> "sliding-window-prepare" [queueSize=1024];
	subgraph cluster_0 {
		"sliding-window-prepare" -> "sliding-window" [label="distributed-partitioned", queueSize=1024];
	}
	"sliding-window" -> "listSink(result-1acb9824-1220-42b4-bbee-1c6755d0581c)" [queueSize=1024];
}
HINT: You can use graphviz or http://viz-js.com to visualize the printed graph.
2020-11-06 03:23:46,903 [DEBUG] [hz.silly_cannon.cached.thread-13] [c.h.j.i.MasterJobContext]: [127.0.0.1]:5701 [jet] [4.4-SNAPSHOT] Building execution plan for job '053b-2414-3200-0001', execution 053b-2414-3201-0001
2020-11-06 03:23:46,914 [DEBUG] [hz.silly_cannon.cached.thread-13] [c.h.j.i.MasterJobContext]: [127.0.0.1]:5701 [jet] [4.4-SNAPSHOT] Built execution plans for job '053b-2414-3200-0001', execution 053b-2414-3201-0001
2020-11-06 03:23:46,917 [DEBUG] [hz.silly_cannon.cached.thread-13] [c.h.j.i.o.InitExecutionOperation]: [127.0.0.1]:5701 [jet] [4.4-SNAPSHOT] Initializing execution plan for job 053b-2414-3200-0001, execution 053b-2414-3201-0001 from [127.0.0.1]:5701
2020-11-06 03:23:46,951 [DEBUG] [hz.silly_cannon.cached.thread-14] [c.h.j.i.JobRepository]: [127.0.0.1]:5701 [jet] [4.4-SNAPSHOT] Job cleanup took 37ms
2020-11-06 03:23:46,978 [ INFO] [hz.silly_cannon.cached.thread-13] [c.h.j.i.JobExecutionService]: [127.0.0.1]:5701 [jet] [4.4-SNAPSHOT] Execution plan for jobId=053b-2414-3200-0001, jobName='053b-2414-3200-0001', executionId=053b-2414-3201-0001 initialized
2020-11-06 03:23:46,996 [DEBUG] [hz.elastic_cannon.generic-operation.thread-15] [c.h.j.i.o.InitExecutionOperation]: [127.0.0.1]:5702 [jet] [4.4-SNAPSHOT] Initializing execution plan for job 053b-2414-3200-0001, execution 053b-2414-3201-0001 from [127.0.0.1]:5701
2020-11-06 03:23:47,154 [ INFO] [hz.elastic_cannon.generic-operation.thread-15] [c.h.j.i.JobExecutionService]: [127.0.0.1]:5702 [jet] [4.4-SNAPSHOT] Execution plan for jobId=053b-2414-3200-0001, jobName='053b-2414-3200-0001', executionId=053b-2414-3201-0001 initialized
2020-11-06 03:23:47,155 [DEBUG] [hz.silly_cannon.cached.thread-13] [c.h.j.i.MasterJobContext]: [127.0.0.1]:5701 [jet] [4.4-SNAPSHOT] Init of job '053b-2414-3200-0001', execution 053b-2414-3201-0001 was successful
2020-11-06 03:23:47,155 [DEBUG] [hz.silly_cannon.cached.thread-13] [c.h.j.i.MasterJobContext]: [127.0.0.1]:5701 [jet] [4.4-SNAPSHOT] Executing job '053b-2414-3200-0001', execution 053b-2414-3201-0001
2020-11-06 03:23:47,155 [ INFO] [hz.silly_cannon.cached.thread-13] [c.h.j.i.JobExecutionService]: [127.0.0.1]:5701 [jet] [4.4-SNAPSHOT] Start execution of job '053b-2414-3200-0001', execution 053b-2414-3201-0001 from coordinator [127.0.0.1]:5701
2020-11-06 03:23:47,662 [DEBUG] [hz.silly_cannon.cached.thread-59] [c.h.j.i.MasterJobContext]: [127.0.0.1]:5701 [jet] [4.4-SNAPSHOT] Not scaling up job '053b-2414-3200-0001', execution 053b-2414-3201-0001: not running or already running on all members
2020-11-06 03:23:47,723 [DEBUG] [hz.silly_cannon.cached.thread-13] [c.h.j.i.JobCoordinationService]: [127.0.0.1]:5701 [jet] [4.4-SNAPSHOT] job '053b-2414-3200-0001', execution 053b-2414-3201-0001 snapshot is scheduled in 500ms
2020-11-06 03:23:47,726 [ INFO] [hz.elastic_cannon.generic-operation.thread-16] [c.h.j.i.JobExecutionService]: [127.0.0.1]:5702 [jet] [4.4-SNAPSHOT] Start execution of job '053b-2414-3200-0001', execution 053b-2414-3201-0001 from coordinator [127.0.0.1]:5701
2020-11-06 03:23:48,276 [DEBUG] [hz.silly_cannon.cached.thread-54] [c.h.j.i.MasterSnapshotContext]: [127.0.0.1]:5701 [jet] [4.4-SNAPSHOT] Starting snapshot 0 for job '053b-2414-3200-0001', execution 053b-2414-3201-0001, flags: terminal=no,export=no, writing to: null
2020-11-06 03:23:48,276 [DEBUG] [hz.silly_cannon.cached.thread-54] [c.h.j.i.e.ExecutionContext]: [127.0.0.1]:5701 [jet] [4.4-SNAPSHOT] Starting snapshot 0 phase 1 for job '053b-2414-3200-0001', execution 053b-2414-3201-0001 on member
2020-11-06 03:23:48,278 [DEBUG] [hz.elastic_cannon.generic-operation.thread-18] [c.h.j.i.e.ExecutionContext]: [127.0.0.1]:5702 [jet] [4.4-SNAPSHOT] Starting snapshot 0 phase 1 for job '053b-2414-3200-0001', execution 053b-2414-3201-0001 on member
Will save 321 to snapshot
2020-11-06 03:23:48,749 [DEBUG] [hz.silly_cannon.jet.cooperative.thread-0] [c.h.j.i.u.AsyncSnapshotWriterImpl]: [127.0.0.1]:5701 [jet] [4.4-SNAPSHOT] Stats for src: keys=1, chunks=1, bytes=170
2020-11-06 03:23:48,754 [DEBUG] [hz.silly_cannon.jet.cooperative.thread-1] [c.h.j.i.u.AsyncSnapshotWriterImpl]: [127.0.0.1]:5701 [jet] [4.4-SNAPSHOT] Stats for sliding-window-prepare: keys=0, chunks=0, bytes=0
2020-11-06 03:23:48,759 [ INFO] [hz.elastic_cannon.jet.cooperative.thread-5] [c.h.j.i.p.S.sliding-window#1]: [127.0.0.1]:5702 [jet] [4.4-SNAPSHOT] Output to ordinal 0: Watermark{ts=00:00:00.300}
2020-11-06 03:23:48,769 [ INFO] [hz.silly_cannon.jet.cooperative.thread-4] [c.h.j.i.p.S.sliding-window#0]: [127.0.0.1]:5701 [jet] [4.4-SNAPSHOT] Output to ordinal 0: WindowResult{start=00:00:00.000, end=00:00:00.100, value='100', isEarly=false} (eventTime=00:00:00.099)
2020-11-06 03:23:48,769 [ INFO] [hz.silly_cannon.jet.cooperative.thread-4] [c.h.j.i.p.S.sliding-window#0]: [127.0.0.1]:5701 [jet] [4.4-SNAPSHOT] Output to ordinal 0: WindowResult{start=00:00:00.100, end=00:00:00.200, value='100', isEarly=false} (eventTime=00:00:00.199)
2020-11-06 03:23:48,770 [ INFO] [hz.silly_cannon.jet.cooperative.thread-4] [c.h.j.i.p.S.sliding-window#0]: [127.0.0.1]:5701 [jet] [4.4-SNAPSHOT] Output to ordinal 0: WindowResult{start=00:00:00.200, end=00:00:00.300, value='100', isEarly=false} (eventTime=00:00:00.299)
2020-11-06 03:23:48,770 [ INFO] [hz.silly_cannon.jet.cooperative.thread-4] [c.h.j.i.p.S.sliding-window#0]: [127.0.0.1]:5701 [jet] [4.4-SNAPSHOT] Output to ordinal 0: Watermark{ts=00:00:00.300}
2020-11-06 03:23:51,975 [DEBUG] [hz.silly_cannon.cached.thread-54] [c.h.j.i.JobRepository]: [127.0.0.1]:5701 [jet] [4.4-SNAPSHOT] Job cleanup took 1ms
2020-11-06 03:23:57,039 [DEBUG] [hz.silly_cannon.cached.thread-54] [c.h.j.i.JobRepository]: [127.0.0.1]:5701 [jet] [4.4-SNAPSHOT] Job cleanup took 36ms
2020-11-06 03:24:00,414 [ INFO] [Time-limited test] [c.h.j.c.JetTestSupport]: Terminating instanceFactory in JetTestSupport.@After
2020-11-06 03:24:00,441 [ INFO] [Thread-478] [c.h.c.LifecycleService]: [127.0.0.1]:5702 [jet] [4.4-SNAPSHOT] [127.0.0.1]:5702 is SHUTTING_DOWN
2020-11-06 03:24:00,443 [ WARN] [Thread-478] [c.h.i.i.Node]: [127.0.0.1]:5702 [jet] [4.4-SNAPSHOT] Terminating forcefully...
2020-11-06 03:24:00,444 [ INFO] [Thread-478] [c.h.i.i.Node]: [127.0.0.1]:5702 [jet] [4.4-SNAPSHOT] Shutting down connection manager...
2020-11-06 03:24:00,476 [ INFO] [Thread-478] [c.h.t.m.MockServer]: [127.0.0.1]:5701 [jet] [4.4-SNAPSHOT] Removed connection to endpoint: [127.0.0.1]:5702, connection: MockConnection{localEndpoint=[127.0.0.1]:5701, remoteEndpoint=[127.0.0.1]:5702, alive=false}
2020-11-06 03:24:00,476 [ INFO] [Thread-478] [c.h.t.m.MockServer]: [127.0.0.1]:5702 [jet] [4.4-SNAPSHOT] Removed connection to endpoint: [127.0.0.1]:5701, connection: MockConnection{localEndpoint=[127.0.0.1]:5702, remoteEndpoint=[127.0.0.1]:5701, alive=false}
2020-11-06 03:24:00,477 [ INFO] [Thread-478] [c.h.i.c.i.MembershipManager]: [127.0.0.1]:5701 [jet] [4.4-SNAPSHOT] Removing Member [127.0.0.1]:5702 - 54517725-aadf-4ea1-a655-dd89f54702f8
2020-11-06 03:24:00,480 [ INFO] [Thread-478] [c.h.i.c.ClusterService]: [127.0.0.1]:5701 [jet] [4.4-SNAPSHOT] 

gurbuzali avatar Nov 17 '20 07:11 gurbuzali