nomad
nomad copied to clipboard
Feature request: detailed allocation GC docs and logging of allocation state before GC
I recently had some allocations fail and later be replaced with new ones. Judging by the client logs below, allocations were GC'd less than 2h after allocation failure. I can't tell by the logs when the Nomad servers garbage collected the allocations.
2017/10/18 21:50:00.152653 [INFO] client: garbage collecting allocation 44ff048f-cb00-3c0b-528b-9aa76f1867cc due to forced collection
2017/10/18 21:50:00.152515 [INFO] client: marking allocation 44ff048f-cb00-3c0b-528b-9aa76f1867cc for GC
2017/10/18 20:42:25.759585 [INFO] client: marking allocation 44ff048f-cb00-3c0b-528b-9aa76f1867cc for GC
2017/10/18 20:42:25.759441 [INFO] client: marking allocation 44ff048f-cb00-3c0b-528b-9aa76f1867cc for GC
By the time I noticed that the allocation had failed, any debugging information was long gone. When googling for documentation about Nomad allocation GC, I have only been able to find the GC API call documentation and various github comments, so I've had to piece together an understanding of how alloc GC works.
Two things I would like to see:
- logging of reason for allocation failure before the allocation is GC'd
- more explicit documentation about GC behaviour & timelines on both the client and server
Let me know what you think!
Good timing! I'm working on the GC as we speak and the logging will be improved.
forced collection is the reason. Did you force a GC via the /v1/system/gc HTTP endpoint. Otherwise it was probably the periodic server initiated GC which unfortunately are only logged at DEBUG level.
Great to hear! Better logging and more explicit documentation on how GC works will be very helpful.
No manual call to /v1/system/gc was made. I was under the impression (from gleaning github comments) that the clients kept allocations around until forced to clean up from lack of disk space, but if the server forces clean up on its own schedule that makes sense. Clearer documentation would definitely help reason about these things!
#3445 includes some logging improvements to the GC but does not directly address the items you list.
Hey @schmichael I'm pretty curious about this too. I'm trying to make Nomad stop being so aggressive about cleaning up old allocations, but not having much luck.
I've tried setting gc_max_allocs to 600, and it seems to have helped a little, but Nomad is still cleaning-up allocations well before I've collected 600. It shouldn't be getting anywhere near 80% disk utilisation either (or 70% inode utilisation), so I'm not sure what else could be going on here? Similar to @margueritepd , this is severely limiting our ability to troubleshoot failing jobs, so I'd like to know how I can tune this to keep troubleshooting material around.
root@ip-10-2-233-187:~# df -h
Filesystem Size Used Avail Use% Mounted on
udev 16G 0 16G 0% /dev
tmpfs 3.1G 308M 2.8G 10% /run
/dev/nvme0n1p1 97G 3.6G 94G 4% /
tmpfs 16G 0 16G 0% /dev/shm
tmpfs 5.0M 0 5.0M 0% /run/lock
tmpfs 16G 0 16G 0% /sys/fs/cgroup
/dev/nvme1n1 246G 83M 234G 1% /opt/hashistack
/dev/nvme2n1 246G 7.3G 227G 4% /var/lib/docker
tmpfs 3.1G 0 3.1G 0% /run/user/1000
tmpfs 1.0M 0 1.0M 0% /opt/hashistack/nomad/alloc/ec0bfd07-d229-6713-d282-5bc20c364892/fabio/secrets
overlay 246G 7.3G 227G 4% /var/lib/docker/overlay2/2a0cc2f3ad6228c3b67dc3db31fa96a53d6647df799a6020981f681727218378/merged
shm 64M 0 64M 0% /var/lib/docker/containers/d9ca31436d1d5fa64123a92ebacfb027c7d5b6dd7ef46f4b06819495203e570c/mounts/shm
tmpfs 1.0M 0 1.0M 0% /opt/hashistack/nomad/alloc/29fd1646-3cd1-6b89-2e55-f770841c0d4d/redirect/secrets
overlay 246G 7.3G 227G 4% /var/lib/docker/overlay2/9e5254b71933c04196e2703b03fb235582f7abf3b663274e572a38fe974a5c77/merged
shm 64M 0 64M 0% /var/lib/docker/containers/fa46dc637041402c2da3250f6c6e85ef4fd3403a561254cd5625f32167f366d7/mounts/shm
root@ip-10-2-233-187:~# ls -1 /opt/hashistack/nomad/alloc/ | wc -l
42
root@ip-10-2-233-187:~# grep gc /opt/hashistack/nomad/conf.d/*
/opt/hashistack/nomad/conf.d/client.hcl: gc_interval = "5m"
/opt/hashistack/nomad/conf.d/client.hcl: gc_max_allocs = 600
As for the debug logs, I'm not sure what I'm looking for, but I don't see anything that tells me why it decided to force this collection.
Jul 11 04:31:13 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:13.003680 [DEBUG] http: Request GET /v1/allocations?prefix=e7a6f05c (925.312µs)
Jul 11 04:31:13 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:13.005695 [DEBUG] http: Request GET /v1/allocation/e7a6f05c-c770-1626-20d0-0db1d2791bed (1.098235ms)
Jul 11 04:31:13 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:13.034602 [DEBUG] http: Request GET /v1/allocations?prefix=e7a6f05c (1.224081ms)
Jul 11 04:31:13 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:13.037078 [DEBUG] http: Request GET /v1/allocation/e7a6f05c-c770-1626-20d0-0db1d2791bed (1.606743ms)
Jul 11 04:31:13 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:13.069325 [DEBUG] http: Request GET /v1/allocations?prefix=e7a6f05c-c770-1626-20d0-0db1d2791bed (1.189478ms)
Jul 11 04:31:13 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:13.071825 [DEBUG] http: Request GET /v1/allocation/e7a6f05c-c770-1626-20d0-0db1d2791bed (1.558742ms)
Jul 11 04:31:13 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:13.103712 [DEBUG] http: Request GET /v1/allocations?prefix=e7a6f05c-c770-1626-20d0-0db1d2791bed (1.958686ms)
Jul 11 04:31:13 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:13.105823 [DEBUG] http: Request GET /v1/allocation/e7a6f05c-c770-1626-20d0-0db1d2791bed (1.117806ms)
Jul 11 04:31:13 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:13.137913 [DEBUG] http: Request GET /v1/allocations?prefix=e7a6f05c-c770-1626-20d0-0db1d2791bed (926.151µs)
Jul 11 04:31:13 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:13.140360 [DEBUG] http: Request GET /v1/allocation/e7a6f05c-c770-1626-20d0-0db1d2791bed (1.498045ms)
Jul 11 04:31:13 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:13.172380 [DEBUG] http: Request GET /v1/allocations?prefix=e7a6f05c-c770-1626-20d0-0db1d2791bed (1.226399ms)
Jul 11 04:31:13 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:13.174837 [DEBUG] http: Request GET /v1/allocation/e7a6f05c-c770-1626-20d0-0db1d2791bed (1.49045ms)
Jul 11 04:31:13 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:13.205521 [DEBUG] http: Request GET /v1/allocations?prefix=ec0bfd07 (1.018545ms)
Jul 11 04:31:13 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:13.207414 [DEBUG] http: Request GET /v1/allocation/ec0bfd07-d229-6713-d282-5bc20c364892 (1.071525ms)
Jul 11 04:31:13 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:13.236774 [DEBUG] http: Request GET /v1/allocations?prefix=ec0bfd07 (1.11573ms)
Jul 11 04:31:13 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:13.238764 [DEBUG] http: Request GET /v1/allocation/ec0bfd07-d229-6713-d282-5bc20c364892 (1.210661ms)
Jul 11 04:31:13 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:13.268730 [DEBUG] http: Request GET /v1/allocations?prefix=ec0bfd07 (987.695µs)
Jul 11 04:31:13 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:13.272626 [DEBUG] http: Request GET /v1/allocation/ec0bfd07-d229-6713-d282-5bc20c364892 (2.998744ms)
Jul 11 04:31:13 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:13.312814 [DEBUG] http: Request GET /v1/allocations?prefix=ec0bfd07-d229-6713-d282-5bc20c364892 (10.08835ms)
Jul 11 04:31:13 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:13.315343 [DEBUG] http: Request GET /v1/allocation/ec0bfd07-d229-6713-d282-5bc20c364892 (1.712561ms)
Jul 11 04:31:13 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:13.343898 [DEBUG] http: Request GET /v1/allocations?prefix=ec0bfd07-d229-6713-d282-5bc20c364892 (859.027µs)
Jul 11 04:31:13 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:13.346204 [DEBUG] http: Request GET /v1/allocation/ec0bfd07-d229-6713-d282-5bc20c364892 (1.441897ms)
Jul 11 04:31:13 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:13.374149 [DEBUG] http: Request GET /v1/allocations?prefix=ec0bfd07-d229-6713-d282-5bc20c364892 (1.034511ms)
Jul 11 04:31:13 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:13.376433 [DEBUG] http: Request GET /v1/allocation/ec0bfd07-d229-6713-d282-5bc20c364892 (1.484758ms)
Jul 11 04:31:13 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:13.406492 [DEBUG] http: Request GET /v1/allocations?prefix=ec0bfd07-d229-6713-d282-5bc20c364892 (1.058281ms)
Jul 11 04:31:13 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:13.408615 [DEBUG] http: Request GET /v1/allocation/ec0bfd07-d229-6713-d282-5bc20c364892 (1.336813ms)
Jul 11 04:31:13 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:13.438323 [DEBUG] http: Request GET /v1/allocations?prefix=f2937291 (933.751µs)
Jul 11 04:31:13 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:13.440300 [DEBUG] http: Request GET /v1/allocation/f2937291-f9a0-abea-c368-dacfdbd1453d (1.129102ms)
Jul 11 04:31:13 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:13.468672 [DEBUG] http: Request GET /v1/allocations?prefix=f2937291 (1.197681ms)
Jul 11 04:31:13 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:13.470736 [DEBUG] http: Request GET /v1/allocation/f2937291-f9a0-abea-c368-dacfdbd1453d (1.139369ms)
Jul 11 04:31:13 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:13.502658 [DEBUG] http: Request GET /v1/allocations?prefix=f2937291 (2.846407ms)
Jul 11 04:31:13 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:13.512273 [DEBUG] http: Request GET /v1/allocation/f2937291-f9a0-abea-c368-dacfdbd1453d (8.654215ms)
Jul 11 04:31:13 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:13.542595 [DEBUG] http: Request GET /v1/allocations?prefix=f2937291-f9a0-abea-c368-dacfdbd1453d (914.572µs)
Jul 11 04:31:13 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:13.544621 [DEBUG] http: Request GET /v1/allocation/f2937291-f9a0-abea-c368-dacfdbd1453d (1.105978ms)
Jul 11 04:31:13 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:13.574435 [DEBUG] http: Request GET /v1/allocations?prefix=f2937291-f9a0-abea-c368-dacfdbd1453d (956.429µs)
Jul 11 04:31:13 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:13.577111 [DEBUG] http: Request GET /v1/allocation/f2937291-f9a0-abea-c368-dacfdbd1453d (1.845837ms)
Jul 11 04:31:13 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:13.608363 [DEBUG] http: Request GET /v1/allocations?prefix=f2937291-f9a0-abea-c368-dacfdbd1453d (1.014081ms)
Jul 11 04:31:13 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:13.610456 [DEBUG] http: Request GET /v1/allocation/f2937291-f9a0-abea-c368-dacfdbd1453d (1.139116ms)
Jul 11 04:31:13 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:13.641008 [DEBUG] http: Request GET /v1/allocations?prefix=f2937291-f9a0-abea-c368-dacfdbd1453d (1.278214ms)
Jul 11 04:31:13 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:13.643608 [DEBUG] http: Request GET /v1/allocation/f2937291-f9a0-abea-c368-dacfdbd1453d (1.706629ms)
Jul 11 04:31:13 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:13.672222 [DEBUG] http: Request GET /v1/allocations?prefix=fbfdd042 (1.128754ms)
Jul 11 04:31:13 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:13.674995 [DEBUG] http: Request GET /v1/allocation/fbfdd042-e250-3448-c078-06f59340d9b3 (1.958545ms)
Jul 11 04:31:13 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:13.703533 [DEBUG] http: Request GET /v1/allocations?prefix=fbfdd042 (1.147387ms)
Jul 11 04:31:13 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:13.706227 [DEBUG] http: Request GET /v1/allocation/fbfdd042-e250-3448-c078-06f59340d9b3 (1.863357ms)
Jul 11 04:31:13 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:13.737875 [DEBUG] http: Request GET /v1/allocations?prefix=fbfdd042 (1.182825ms)
Jul 11 04:31:13 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:13.740536 [DEBUG] http: Request GET /v1/allocation/fbfdd042-e250-3448-c078-06f59340d9b3 (1.881201ms)
Jul 11 04:31:13 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:13.772625 [DEBUG] http: Request GET /v1/allocations?prefix=fbfdd042-e250-3448-c078-06f59340d9b3 (927.286µs)
Jul 11 04:31:13 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:13.775515 [DEBUG] http: Request GET /v1/allocation/fbfdd042-e250-3448-c078-06f59340d9b3 (2.002904ms)
Jul 11 04:31:13 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:13.804413 [DEBUG] http: Request GET /v1/allocations?prefix=fbfdd042-e250-3448-c078-06f59340d9b3 (1.185495ms)
Jul 11 04:31:13 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:13.806603 [DEBUG] http: Request GET /v1/allocation/fbfdd042-e250-3448-c078-06f59340d9b3 (1.387149ms)
Jul 11 04:31:13 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:13.836496 [DEBUG] http: Request GET /v1/allocations?prefix=fbfdd042-e250-3448-c078-06f59340d9b3 (854.769µs)
Jul 11 04:31:13 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:13.839310 [DEBUG] http: Request GET /v1/allocation/fbfdd042-e250-3448-c078-06f59340d9b3 (1.913087ms)
Jul 11 04:31:13 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:13.870275 [DEBUG] http: Request GET /v1/allocations?prefix=fbfdd042-e250-3448-c078-06f59340d9b3 (1.111855ms)
Jul 11 04:31:13 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:13.872544 [DEBUG] http: Request GET /v1/allocation/fbfdd042-e250-3448-c078-06f59340d9b3 (1.436841ms)
Jul 11 04:31:13 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:13.903173 [DEBUG] http: Request GET /v1/allocations?prefix=fe2645ab (1.198268ms)
Jul 11 04:31:13 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:13.905163 [DEBUG] http: Request GET /v1/allocation/fe2645ab-f783-1cb8-5152-1ab41a15f15a (1.112841ms)
Jul 11 04:31:13 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:13.934233 [DEBUG] http: Request GET /v1/allocations?prefix=fe2645ab (1.07565ms)
Jul 11 04:31:13 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:13.936604 [DEBUG] http: Request GET /v1/allocation/fe2645ab-f783-1cb8-5152-1ab41a15f15a (1.439917ms)
Jul 11 04:31:13 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:13.966999 [DEBUG] http: Request GET /v1/allocations?prefix=fe2645ab (1.141737ms)
Jul 11 04:31:13 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:13.969382 [DEBUG] http: Request GET /v1/allocation/fe2645ab-f783-1cb8-5152-1ab41a15f15a (1.514451ms)
Jul 11 04:31:13 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:13.999045 [DEBUG] http: Request GET /v1/allocations?prefix=fe2645ab-f783-1cb8-5152-1ab41a15f15a (1.088026ms)
Jul 11 04:31:14 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:14.001469 [DEBUG] http: Request GET /v1/allocation/fe2645ab-f783-1cb8-5152-1ab41a15f15a (1.542787ms)
Jul 11 04:31:14 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:14.031683 [DEBUG] http: Request GET /v1/allocations?prefix=fe2645ab-f783-1cb8-5152-1ab41a15f15a (929.361µs)
Jul 11 04:31:14 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:14.033685 [DEBUG] http: Request GET /v1/allocation/fe2645ab-f783-1cb8-5152-1ab41a15f15a (1.08413ms)
Jul 11 04:31:14 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:14.064591 [DEBUG] http: Request GET /v1/allocations?prefix=fe2645ab-f783-1cb8-5152-1ab41a15f15a (1.229859ms)
Jul 11 04:31:14 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:14.067155 [DEBUG] http: Request GET /v1/allocation/fe2645ab-f783-1cb8-5152-1ab41a15f15a (1.589927ms)
Jul 11 04:31:14 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:14.097635 [DEBUG] http: Request GET /v1/allocations?prefix=fe2645ab-f783-1cb8-5152-1ab41a15f15a (1.188303ms)
Jul 11 04:31:14 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:14.099967 [DEBUG] http: Request GET /v1/allocation/fe2645ab-f783-1cb8-5152-1ab41a15f15a (1.464539ms)
Jul 11 04:31:14 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:14.240560 [DEBUG] client: updated allocations at index 5172654 (total 42) (pulled 14) (filtered 28)
Jul 11 04:31:14 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:14.240681 [DEBUG] client: allocs: (added 0) (removed 12) (updated 14) (ignore 28)
Jul 11 04:31:14 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:14.240841 [INFO] client.gc: garbage collecting allocation fbfdd042-e250-3448-c078-06f59340d9b3 due to forced collection
Jul 11 04:31:14 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:14.240847 [INFO] client.gc: garbage collecting allocation 7b6a3448-544f-7a2b-0a6c-9b1f21256351 due to forced collection
Jul 11 04:31:14 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:14.240864 [INFO] client.gc: garbage collecting allocation c34340bd-9dbc-ecdc-55b1-4e160eaf0022 due to forced collection
Jul 11 04:31:14 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:14.240869 [INFO] client.gc: garbage collecting allocation 1ccf20d2-53e8-6b25-d869-cc3b84bf746c due to forced collection
Jul 11 04:31:14 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:14.240913 [INFO] client.gc: garbage collecting allocation 6be5c421-90d9-6be0-ea92-654a46ce2b98 due to forced collection
Jul 11 04:31:14 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:14.240940 [INFO] client.gc: garbage collecting allocation 38fe940b-f8fd-35aa-48a6-c784bb485134 due to forced collection
Jul 11 04:31:14 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:14.240953 [INFO] client.gc: garbage collecting allocation 2cc0e2f5-8385-49d1-2f93-2d69ac9531e3 due to forced collection
Jul 11 04:31:14 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:14.240965 [DEBUG] client: dropping update to terminal alloc 'f2937291-f9a0-abea-c368-dacfdbd1453d'
Jul 11 04:31:14 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:14.240967 [INFO] client.gc: garbage collecting allocation a3efda3b-f57c-2b1f-2b87-0497ce3a27b6 due to forced collection
Jul 11 04:31:14 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:14.240975 [INFO] client.gc: garbage collecting allocation a9a144c0-1d30-30c0-0e71-a0d72f3d8e04 due to forced collection
Jul 11 04:31:14 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:14.240980 [DEBUG] client: dropping update to terminal alloc 'a2c04f69-1824-98f1-6b81-67d1f5b3b96b'
Jul 11 04:31:14 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:14.240997 [INFO] client.gc: garbage collecting allocation 9c522d98-eca1-d450-d002-d74e33dace1b due to forced collection
Jul 11 04:31:14 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:14.241009 [DEBUG] client: dropping update to terminal alloc 'ba14d8c9-1c8e-b122-e2fb-76b320ef40da'
Jul 11 04:31:14 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:14.241007 [INFO] client.gc: garbage collecting allocation 07af1a30-ac9f-0b89-2342-ef50b5093f73 due to forced collection
Jul 11 04:31:14 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:14.241019 [DEBUG] client: dropping update to terminal alloc 'e59bdf2f-7961-3f82-5f59-8bc1fbcee7ed'
Jul 11 04:31:14 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:14.241026 [DEBUG] client: dropping update to terminal alloc 'e7a6f05c-c770-1626-20d0-0db1d2791bed'
Jul 11 04:31:14 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:14.241032 [DEBUG] client: dropping update to terminal alloc '8987e2ef-8654-c928-a482-e2892b2065f3'
Jul 11 04:31:14 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:14.241038 [DEBUG] client: dropping update to terminal alloc '69e3c84b-2821-78bd-0aea-3a391adf585a'
Jul 11 04:31:14 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:14.241013 [INFO] client.gc: garbage collecting allocation 36c5e823-7fca-98db-a116-e0c674ff43b0 due to forced collection
Jul 11 04:31:14 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:14.241018 [DEBUG] client: dropping update to terminal alloc 'bfbbb486-513f-f242-288b-25601770c20b'
Jul 11 04:31:14 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:14.241015 [DEBUG] client: dropping update to terminal alloc '6823974e-8fb3-593c-e7d4-a5a8901e9479'
Jul 11 04:31:14 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:14.241023 [DEBUG] client: dropping update to terminal alloc '9d9fb35d-a163-0a3d-780a-bb5cf78afbb6'
Jul 11 04:31:14 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:14.241025 [DEBUG] client: dropping update to terminal alloc 'b41dc687-655c-1fbc-80f3-bbdff8cc6f76'
Jul 11 04:31:14 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:14.241031 [DEBUG] client: dropping update to terminal alloc 'ae6c1153-6991-d3f7-5c5c-d1fb1efefc9b'
Jul 11 04:31:14 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:14.241026 [DEBUG] client: dropping update to terminal alloc '83dcc035-ac77-8079-939f-6618cb038805'
Jul 11 04:31:14 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:14.241035 [DEBUG] client: dropping update to terminal alloc 'bc9e71bd-0b16-fd50-4e22-dd47c15ffdfb'
Jul 11 04:31:14 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:14.242392 [DEBUG] client: terminating runner for alloc 'fbfdd042-e250-3448-c078-06f59340d9b3'
Jul 11 04:31:14 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:14.242417 [DEBUG] client.gc: garbage collected "fbfdd042-e250-3448-c078-06f59340d9b3"
Jul 11 04:31:14 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:14.243504 [DEBUG] client: terminating runner for alloc '7b6a3448-544f-7a2b-0a6c-9b1f21256351'
Jul 11 04:31:14 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:14.243535 [DEBUG] client.gc: garbage collected "7b6a3448-544f-7a2b-0a6c-9b1f21256351"
Jul 11 04:31:14 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:14.244859 [DEBUG] client: terminating runner for alloc 'c34340bd-9dbc-ecdc-55b1-4e160eaf0022'
Jul 11 04:31:14 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:14.244894 [DEBUG] client.gc: garbage collected "c34340bd-9dbc-ecdc-55b1-4e160eaf0022"
Jul 11 04:31:14 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:14.247478 [DEBUG] client: terminating runner for alloc '6be5c421-90d9-6be0-ea92-654a46ce2b98'
Jul 11 04:31:14 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:14.247512 [DEBUG] client.gc: garbage collected "6be5c421-90d9-6be0-ea92-654a46ce2b98"
Jul 11 04:31:14 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:14.248741 [DEBUG] client: terminating runner for alloc '1ccf20d2-53e8-6b25-d869-cc3b84bf746c'
Jul 11 04:31:14 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:14.248766 [DEBUG] client.gc: garbage collected "1ccf20d2-53e8-6b25-d869-cc3b84bf746c"
Jul 11 04:31:14 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:14.249788 [DEBUG] client: terminating runner for alloc '38fe940b-f8fd-35aa-48a6-c784bb485134'
Jul 11 04:31:14 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:14.249818 [DEBUG] client.gc: garbage collected "38fe940b-f8fd-35aa-48a6-c784bb485134"
Jul 11 04:31:14 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:14.250917 [DEBUG] client: terminating runner for alloc '2cc0e2f5-8385-49d1-2f93-2d69ac9531e3'
Jul 11 04:31:14 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:14.250943 [DEBUG] client.gc: garbage collected "2cc0e2f5-8385-49d1-2f93-2d69ac9531e3"
Jul 11 04:31:14 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:14.251897 [DEBUG] client: terminating runner for alloc 'a3efda3b-f57c-2b1f-2b87-0497ce3a27b6'
Jul 11 04:31:14 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:14.251923 [DEBUG] client.gc: garbage collected "a3efda3b-f57c-2b1f-2b87-0497ce3a27b6"
Jul 11 04:31:14 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:14.253027 [DEBUG] client: terminating runner for alloc 'a9a144c0-1d30-30c0-0e71-a0d72f3d8e04'
Jul 11 04:31:14 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:14.253051 [DEBUG] client.gc: garbage collected "a9a144c0-1d30-30c0-0e71-a0d72f3d8e04"
Jul 11 04:31:14 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:14.254056 [DEBUG] client: terminating runner for alloc '9c522d98-eca1-d450-d002-d74e33dace1b'
Jul 11 04:31:14 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:14.254075 [DEBUG] client.gc: garbage collected "9c522d98-eca1-d450-d002-d74e33dace1b"
Jul 11 04:31:14 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:14.255124 [DEBUG] client: terminating runner for alloc '07af1a30-ac9f-0b89-2342-ef50b5093f73'
Jul 11 04:31:14 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:14.255167 [DEBUG] client.gc: garbage collected "07af1a30-ac9f-0b89-2342-ef50b5093f73"
Jul 11 04:31:14 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:14.258133 [DEBUG] client: terminating runner for alloc '36c5e823-7fca-98db-a116-e0c674ff43b0'
Jul 11 04:31:14 ip-10-2-233-187 nomad[15096]: 2018/07/11 04:31:14.258167 [DEBUG] client.gc: garbage collected "36c5e823-7fca-98db-a116-e0c674ff43b0"
Hi @holtwilkins,
Nomad servers periodically GC dead jobs and force GC their allocations on their client nodes. You can configure that interval on servers with the job_gc_threshold parameter. It defaults to 4 hours.
Cool, I'll try bumping that one too. I guess even more reason it would be nice to have better GC logging in Nomad :).