Coworker
Coworker copied to clipboard
[Question] Is it possible that sometimes the same delayed_work record is processed more than once?
Description
Is it possible that sometimes the same delayed_work record is processed more than once?
From the testing we've been able to do so far, it seems like this is an intermittent issue. Also, I haven't encountered a scenario where the same record was processed more than twice. I have created a sample Spring Boot project here https://github.com/RMISCG/queue which you can use for testing purposes. The project has a README.md file in the parent directory for more details.
Additional Information
-
Framework Version: Java: 11.0.4 Spring Boot: 2.2.5.RELEASE Gradle: 5.6
-
Platform: macOS Mojave Version 10.14.6
-
Command/Work You Tried to Run:
- cd into the parent directory.
- ./gradlew clean
- ./gradlew build
- ./gradlew bootRun
- Any Output you Recieved (record with state 09 processed twice):
2020-03-03 12:58:35.846 INFO 34280 --- [ main] com.rss.queue.QueueApplication : Starting QueueApplication on ml0179620 with PID 34280 (/Users/rmis/dev/apps-external/queue/build/classes/java/main started by rmis in /Users/rmis/dev/apps-external/queue)
2020-03-03 12:58:35.849 INFO 34280 --- [ main] com.rss.queue.QueueApplication : No active profile set, falling back to default profiles: default
2020-03-03 12:58:36.413 INFO 34280 --- [ main] com.zaxxer.hikari.HikariDataSource : HikariPool-1 - Starting...
2020-03-03 12:58:36.993 INFO 34280 --- [ main] com.zaxxer.hikari.HikariDataSource : HikariPool-1 - Start completed.
2020-03-03 12:58:37.499 INFO 34280 --- [ main] com.rss.queue.QueueApplication : Created com.rss.queue.job.EchoJob with state 01 and result 01
2020-03-03 12:58:37.610 INFO 34280 --- [ main] com.rss.queue.QueueApplication : Created com.rss.queue.job.EchoJob with state 02 and result 02
2020-03-03 12:58:37.722 INFO 34280 --- [ main] com.rss.queue.QueueApplication : Created com.rss.queue.job.EchoJob with state 03 and result 03
2020-03-03 12:58:37.834 INFO 34280 --- [ main] com.rss.queue.QueueApplication : Created com.rss.queue.job.EchoJob with state 04 and result 04
2020-03-03 12:58:37.949 INFO 34280 --- [ main] com.rss.queue.QueueApplication : Created com.rss.queue.job.EchoJob with state 05 and result 05
2020-03-03 12:58:38.060 INFO 34280 --- [ main] com.rss.queue.QueueApplication : Created com.rss.queue.job.EchoJob with state 06 and result 06
2020-03-03 12:58:38.172 INFO 34280 --- [ main] com.rss.queue.QueueApplication : Created com.rss.queue.job.EchoJob with state 07 and result 07
2020-03-03 12:58:38.282 INFO 34280 --- [ main] com.rss.queue.QueueApplication : Created com.rss.queue.job.EchoJob with state 08 and result 08
2020-03-03 12:58:38.393 INFO 34280 --- [ main] com.rss.queue.QueueApplication : Created com.rss.queue.job.EchoJob with state 09 and result 09
2020-03-03 12:58:38.504 INFO 34280 --- [ main] com.rss.queue.QueueApplication : Created com.rss.queue.job.EchoJob with state 10 and result 10
2020-03-03 12:58:38.616 INFO 34280 --- [ main] com.rss.queue.QueueApplication : Created com.rss.queue.job.EchoJob with state 11 and result 11
2020-03-03 12:58:38.729 INFO 34280 --- [ main] com.rss.queue.QueueApplication : Created com.rss.queue.job.EchoJob with state 12 and result 12
2020-03-03 12:58:38.840 INFO 34280 --- [ main] com.rss.queue.QueueApplication : Created com.rss.queue.job.EchoJob with state 13 and result 13
2020-03-03 12:58:38.951 INFO 34280 --- [ main] com.rss.queue.QueueApplication : Created com.rss.queue.job.EchoJob with state 14 and result 14
2020-03-03 12:58:39.065 INFO 34280 --- [ main] com.rss.queue.QueueApplication : Created com.rss.queue.job.EchoJob with state 15 and result 15
2020-03-03 12:58:39.177 INFO 34280 --- [ main] com.rss.queue.QueueApplication : Created com.rss.queue.job.EchoJob with state 16 and result 16
2020-03-03 12:58:39.361 INFO 34280 --- [ main] com.rss.queue.QueueApplication : Created com.rss.queue.job.EchoJob with state 17 and result 17
2020-03-03 12:58:39.478 INFO 34280 --- [ main] com.rss.queue.QueueApplication : Created com.rss.queue.job.EchoJob with state 18 and result 18
2020-03-03 12:58:39.588 INFO 34280 --- [ main] com.rss.queue.QueueApplication : Created com.rss.queue.job.EchoJob with state 19 and result 19
2020-03-03 12:58:39.700 INFO 34280 --- [ main] com.rss.queue.QueueApplication : Created com.rss.queue.job.EchoJob with state 20 and result 20
2020-03-03 12:58:39.813 INFO 34280 --- [ main] com.rss.queue.QueueApplication : Created com.rss.queue.job.EchoJob with state 21 and result 21
2020-03-03 12:58:39.926 INFO 34280 --- [ main] com.rss.queue.QueueApplication : Created com.rss.queue.job.EchoJob with state 22 and result 22
2020-03-03 12:58:40.038 INFO 34280 --- [ main] com.rss.queue.QueueApplication : Created com.rss.queue.job.EchoJob with state 23 and result 23
2020-03-03 12:58:40.151 INFO 34280 --- [ main] com.rss.queue.QueueApplication : Created com.rss.queue.job.EchoJob with state 24 and result 24
2020-03-03 12:58:40.262 INFO 34280 --- [ main] com.rss.queue.QueueApplication : Created com.rss.queue.job.EchoJob with state 25 and result 25
2020-03-03 12:58:40.373 INFO 34280 --- [ main] com.rss.queue.QueueApplication : Created com.rss.queue.job.EchoJob with state 26 and result 26
2020-03-03 12:58:40.485 INFO 34280 --- [ main] com.rss.queue.QueueApplication : Created com.rss.queue.job.EchoJob with state 27 and result 27
2020-03-03 12:58:40.599 INFO 34280 --- [ main] com.rss.queue.QueueApplication : Created com.rss.queue.job.EchoJob with state 28 and result 28
2020-03-03 12:58:40.712 INFO 34280 --- [ main] com.rss.queue.QueueApplication : Created com.rss.queue.job.EchoJob with state 29 and result 29
2020-03-03 12:58:40.832 INFO 34280 --- [ main] com.rss.queue.QueueApplication : Created com.rss.queue.job.EchoJob with state 30 and result 30
2020-03-03 12:58:40.947 INFO 34280 --- [ main] com.rss.queue.QueueApplication : Created com.rss.queue.job.EchoJob with state 31 and result 31
2020-03-03 12:58:41.061 INFO 34280 --- [ main] com.rss.queue.QueueApplication : Created com.rss.queue.job.EchoJob with state 32 and result 32
2020-03-03 12:58:41.172 INFO 34280 --- [ main] com.rss.queue.QueueApplication : Created com.rss.queue.job.EchoJob with state 33 and result 33
2020-03-03 12:58:41.283 INFO 34280 --- [ main] com.rss.queue.QueueApplication : Created com.rss.queue.job.EchoJob with state 34 and result 34
2020-03-03 12:58:41.395 INFO 34280 --- [ main] com.rss.queue.QueueApplication : Created com.rss.queue.job.EchoJob with state 35 and result 35
2020-03-03 12:58:41.506 INFO 34280 --- [ main] com.rss.queue.QueueApplication : Created com.rss.queue.job.EchoJob with state 36 and result 36
2020-03-03 12:58:41.620 INFO 34280 --- [ main] com.rss.queue.QueueApplication : Created com.rss.queue.job.EchoJob with state 37 and result 37
2020-03-03 12:58:41.737 INFO 34280 --- [ main] com.rss.queue.QueueApplication : Created com.rss.queue.job.EchoJob with state 38 and result 38
2020-03-03 12:58:41.849 INFO 34280 --- [ main] com.rss.queue.QueueApplication : Created com.rss.queue.job.EchoJob with state 39 and result 39
2020-03-03 12:58:41.963 INFO 34280 --- [ main] com.rss.queue.QueueApplication : Created com.rss.queue.job.EchoJob with state 40 and result 40
2020-03-03 12:58:42.075 INFO 34280 --- [ main] com.rss.queue.QueueApplication : Created com.rss.queue.job.EchoJob with state 41 and result 41
2020-03-03 12:58:42.186 INFO 34280 --- [ main] com.rss.queue.QueueApplication : Created com.rss.queue.job.EchoJob with state 42 and result 42
2020-03-03 12:58:42.299 INFO 34280 --- [ main] com.rss.queue.QueueApplication : Created com.rss.queue.job.EchoJob with state 43 and result 43
2020-03-03 12:58:42.411 INFO 34280 --- [ main] com.rss.queue.QueueApplication : Created com.rss.queue.job.EchoJob with state 44 and result 44
2020-03-03 12:58:42.523 INFO 34280 --- [ main] com.rss.queue.QueueApplication : Created com.rss.queue.job.EchoJob with state 45 and result 45
2020-03-03 12:58:42.634 INFO 34280 --- [ main] com.rss.queue.QueueApplication : Created com.rss.queue.job.EchoJob with state 46 and result 46
2020-03-03 12:58:42.746 INFO 34280 --- [ main] com.rss.queue.QueueApplication : Created com.rss.queue.job.EchoJob with state 47 and result 47
2020-03-03 12:58:42.865 INFO 34280 --- [ main] com.rss.queue.QueueApplication : Created com.rss.queue.job.EchoJob with state 48 and result 48
2020-03-03 12:58:42.975 INFO 34280 --- [ main] com.rss.queue.QueueApplication : Created com.rss.queue.job.EchoJob with state 49 and result 49
2020-03-03 12:58:43.088 INFO 34280 --- [ main] com.rss.queue.QueueApplication : Created com.rss.queue.job.EchoJob with state 50 and result 50
2020-03-03 12:58:43.366 INFO 34280 --- [pool-1-thread-1] com.rss.queue.job.EchoJob : Running com.rss.queue.job.EchoJob with state 01
2020-03-03 12:58:43.442 INFO 34280 --- [pool-1-thread-2] com.rss.queue.job.EchoJob : Running com.rss.queue.job.EchoJob with state 50
2020-03-03 12:58:43.519 INFO 34280 --- [pool-1-thread-3] com.rss.queue.job.EchoJob : Running com.rss.queue.job.EchoJob with state 49
2020-03-03 12:58:43.596 INFO 34280 --- [pool-1-thread-4] com.rss.queue.job.EchoJob : Running com.rss.queue.job.EchoJob with state 48
2020-03-03 12:58:43.673 INFO 34280 --- [pool-1-thread-5] com.rss.queue.job.EchoJob : Running com.rss.queue.job.EchoJob with state 47
2020-03-03 12:58:44.118 INFO 34280 --- [pool-1-thread-1] com.rss.queue.job.EchoJob : Running com.rss.queue.job.EchoJob with state 46
2020-03-03 12:58:44.192 INFO 34280 --- [pool-1-thread-2] com.rss.queue.job.EchoJob : Running com.rss.queue.job.EchoJob with state 45
2020-03-03 12:58:44.266 INFO 34280 --- [pool-1-thread-3] com.rss.queue.job.EchoJob : Running com.rss.queue.job.EchoJob with state 44
2020-03-03 12:58:44.341 INFO 34280 --- [pool-1-thread-4] com.rss.queue.job.EchoJob : Running com.rss.queue.job.EchoJob with state 43
2020-03-03 12:58:44.421 INFO 34280 --- [pool-1-thread-5] com.rss.queue.job.EchoJob : Running com.rss.queue.job.EchoJob with state 42
2020-03-03 12:58:44.867 INFO 34280 --- [pool-1-thread-1] com.rss.queue.job.EchoJob : Running com.rss.queue.job.EchoJob with state 41
2020-03-03 12:58:44.944 INFO 34280 --- [pool-1-thread-2] com.rss.queue.job.EchoJob : Running com.rss.queue.job.EchoJob with state 40
2020-03-03 12:58:45.020 INFO 34280 --- [pool-1-thread-3] com.rss.queue.job.EchoJob : Running com.rss.queue.job.EchoJob with state 39
2020-03-03 12:58:45.105 INFO 34280 --- [pool-1-thread-4] com.rss.queue.job.EchoJob : Running com.rss.queue.job.EchoJob with state 38
2020-03-03 12:58:45.188 INFO 34280 --- [pool-1-thread-5] com.rss.queue.job.EchoJob : Running com.rss.queue.job.EchoJob with state 37
2020-03-03 12:58:45.696 INFO 34280 --- [pool-1-thread-1] com.rss.queue.job.EchoJob : Running com.rss.queue.job.EchoJob with state 36
2020-03-03 12:58:45.771 INFO 34280 --- [pool-1-thread-2] com.rss.queue.job.EchoJob : Running com.rss.queue.job.EchoJob with state 35
2020-03-03 12:58:45.846 INFO 34280 --- [pool-1-thread-3] com.rss.queue.job.EchoJob : Running com.rss.queue.job.EchoJob with state 34
2020-03-03 12:58:45.922 INFO 34280 --- [pool-1-thread-4] com.rss.queue.job.EchoJob : Running com.rss.queue.job.EchoJob with state 33
2020-03-03 12:58:45.996 INFO 34280 --- [pool-1-thread-5] com.rss.queue.job.EchoJob : Running com.rss.queue.job.EchoJob with state 32
2020-03-03 12:58:46.510 INFO 34280 --- [pool-1-thread-1] com.rss.queue.job.EchoJob : Running com.rss.queue.job.EchoJob with state 31
2020-03-03 12:58:46.587 INFO 34280 --- [pool-1-thread-2] com.rss.queue.job.EchoJob : Running com.rss.queue.job.EchoJob with state 30
2020-03-03 12:58:46.665 INFO 34280 --- [pool-1-thread-3] com.rss.queue.job.EchoJob : Running com.rss.queue.job.EchoJob with state 29
2020-03-03 12:58:46.741 INFO 34280 --- [pool-1-thread-4] com.rss.queue.job.EchoJob : Running com.rss.queue.job.EchoJob with state 28
2020-03-03 12:58:46.819 INFO 34280 --- [pool-1-thread-5] com.rss.queue.job.EchoJob : Running com.rss.queue.job.EchoJob with state 27
2020-03-03 12:58:47.269 INFO 34280 --- [pool-1-thread-1] com.rss.queue.job.EchoJob : Running com.rss.queue.job.EchoJob with state 26
2020-03-03 12:58:47.344 INFO 34280 --- [pool-1-thread-2] com.rss.queue.job.EchoJob : Running com.rss.queue.job.EchoJob with state 25
2020-03-03 12:58:47.418 INFO 34280 --- [pool-1-thread-3] com.rss.queue.job.EchoJob : Running com.rss.queue.job.EchoJob with state 24
2020-03-03 12:58:47.493 INFO 34280 --- [pool-1-thread-4] com.rss.queue.job.EchoJob : Running com.rss.queue.job.EchoJob with state 23
2020-03-03 12:58:47.569 INFO 34280 --- [pool-1-thread-5] com.rss.queue.job.EchoJob : Running com.rss.queue.job.EchoJob with state 22
2020-03-03 12:58:48.103 INFO 34280 --- [pool-1-thread-1] com.rss.queue.job.EchoJob : Running com.rss.queue.job.EchoJob with state 21
2020-03-03 12:58:48.179 INFO 34280 --- [pool-1-thread-2] com.rss.queue.job.EchoJob : Running com.rss.queue.job.EchoJob with state 20
2020-03-03 12:58:48.253 INFO 34280 --- [pool-1-thread-3] com.rss.queue.job.EchoJob : Running com.rss.queue.job.EchoJob with state 19
2020-03-03 12:58:48.327 INFO 34280 --- [pool-1-thread-4] com.rss.queue.job.EchoJob : Running com.rss.queue.job.EchoJob with state 18
2020-03-03 12:58:48.400 INFO 34280 --- [pool-1-thread-5] com.rss.queue.job.EchoJob : Running com.rss.queue.job.EchoJob with state 17
2020-03-03 12:58:48.931 INFO 34280 --- [pool-1-thread-1] com.rss.queue.job.EchoJob : Running com.rss.queue.job.EchoJob with state 16
2020-03-03 12:58:49.005 INFO 34280 --- [pool-1-thread-2] com.rss.queue.job.EchoJob : Running com.rss.queue.job.EchoJob with state 15
2020-03-03 12:58:49.079 INFO 34280 --- [pool-1-thread-3] com.rss.queue.job.EchoJob : Running com.rss.queue.job.EchoJob with state 14
2020-03-03 12:58:49.153 INFO 34280 --- [pool-1-thread-4] com.rss.queue.job.EchoJob : Running com.rss.queue.job.EchoJob with state 13
2020-03-03 12:58:49.232 INFO 34280 --- [pool-1-thread-5] com.rss.queue.job.EchoJob : Running com.rss.queue.job.EchoJob with state 12
2020-03-03 12:58:49.668 INFO 34280 --- [pool-1-thread-1] com.rss.queue.job.EchoJob : Running com.rss.queue.job.EchoJob with state 11
2020-03-03 12:58:49.751 INFO 34280 --- [pool-1-thread-2] com.rss.queue.job.EchoJob : Running com.rss.queue.job.EchoJob with state 10
2020-03-03 12:58:49.827 INFO 34280 --- [pool-1-thread-3] com.rss.queue.job.EchoJob : Running com.rss.queue.job.EchoJob with state 09
2020-03-03 12:58:49.904 INFO 34280 --- [pool-1-thread-4] com.rss.queue.job.EchoJob : Running com.rss.queue.job.EchoJob with state 08
2020-03-03 12:58:49.980 INFO 34280 --- [pool-1-thread-5] com.rss.queue.job.EchoJob : Running com.rss.queue.job.EchoJob with state 07
2020-03-03 12:58:50.495 INFO 34280 --- [pool-1-thread-1] com.rss.queue.job.EchoJob : Running com.rss.queue.job.EchoJob with state 06
2020-03-03 12:58:50.570 INFO 34280 --- [pool-1-thread-2] com.rss.queue.job.EchoJob : Running com.rss.queue.job.EchoJob with state 05
2020-03-03 12:58:50.648 INFO 34280 --- [pool-1-thread-3] com.rss.queue.job.EchoJob : Running com.rss.queue.job.EchoJob with state 04
2020-03-03 12:58:50.726 INFO 34280 --- [pool-1-thread-4] com.rss.queue.job.EchoJob : Running com.rss.queue.job.EchoJob with state 03
2020-03-03 12:58:50.800 INFO 34280 --- [pool-1-thread-5] com.rss.queue.job.EchoJob : Running com.rss.queue.job.EchoJob with state 02
2020-03-03 12:58:51.836 INFO 34280 --- [pool-1-thread-1] com.rss.queue.job.EchoJob : Running com.rss.queue.job.EchoJob with state 09
Disconnected from the target VM, address: '127.0.0.1:59884', transport: 'socket'
Process finished with exit code 130 (interrupted by signal 2: SIGINT)
Hey,
It is certainly possible that a job can be worked twice. Although, I don't think this is a case where it should be. The rework should only happen in a failure case where a job failed to report it's status to the DB. I'll take a bit of a deeper look here as to why a job is getting queued multiple times.
I've run it 10 times, and haven't been able to replicate. Could you perhaps turn on the log level filtering off (I realize that'll be a lot of logs), and post there somewhere like gist.github.com so I can view the logs to see why it's being picked up again?
here's a gist snippet: https://gist.github.com/RMISCG/90780e69d8e517d80b794ca3231860b1 you'll notice that record with state 07 was processed twice. search for string "Running com.rss.queue.job.EchoJob with state 07"
also, if you want to do this yourself, show all coworker logs, simply remove the following block from application.yml
:
logging:
level:
io:
kungfury:
coworker: WARN
Thanks for this! I do know how to change the logs, I just can't reproduce it unfortunately :smile:
So the logs actually show that the job gets released back to the pool:
- https://gist.github.com/RMISCG/90780e69d8e517d80b794ca3231860b1#file-queuecoworkerlogs-txt-L226
The only places ReleaseToPool are called:
-
When a node first comes online it releases it's work that it previously had incase a job gets restarted while it was working: https://github.com/SecurityInsanity/Coworker/blob/d9e6aabb6d/src/main/kotlin/io/kungfury/coworker/CoworkerManager.kt#L104
-
If a service-checker is passed, and it reports a node as offline (this can't be called since the default service checker never reports any nodes as offline since it has no way of knowing): https://github.com/SecurityInsanity/Coworker/blob/d9e6aabb6d/src/main/kotlin/io/kungfury/coworker/CoworkerManager.kt#L121
-
If the future that is handling the work is cancelled: https://github.com/SecurityInsanity/Coworker/blob/d9e6aabb6d/src/main/kotlin/io/kungfury/coworker/CoworkerManager.kt#L219
-
If the work is marked as finished without finishWork being called: https://github.com/SecurityInsanity/Coworker/blob/d9e6aabb6d/src/main/kotlin/io/kungfury/coworker/CoworkerManager.kt#L235
-
If the NStrand is at maximum: https://github.com/SecurityInsanity/Coworker/blob/d9e6aabb6d/src/main/kotlin/io/kungfury/coworker/CoworkerManager.kt#L378
My biggest thought is that the future is cancelled. As none of the others make sense (and I can't seem to reproduce). Is there anything you can think of that would kill the thread/runnable? Perhaps using too much memory? Something along those lines?
I'm not really sure why you're unable to reproduce this. Maybe you can:
- increase the number of delayed_work records - increase
NUMBER_OF_JOBS
inQueueApplication
. AND/OR - increase the number of coworker threads - increase
THREADS
inCoworkerConfiguration
.
Hi,
Any updates on this?
Thanks
Hey,
I'm still not able to replicate this on my end. Even after increasing NUMBER_OF_JOBS
&& THREADS
. Though I still think the future is getting cancelled (somehow). If you'd like debug further yourself I recommend adding in some debug lines on the line I pointed out where a job gets released to the pool for a future getting cancelled (as well as the others) to see where exactly the job is getting released back to the pool.