Instances not reconnecting after Jenkins restart
Plugin version is 4.2.0. If I delete the instances, the plugin deletes them, creates new ones, and then connects again just fine. Nothing shows up in the agent log in the UI. The following error keeps appearing in the Jenkins logs after restart:
2020-01-08 00:51:13.014+0000 [id=79] WARNING c.g.c.g.p.p.client.ComputeClient#lambda$waitForOperationCompletion$11: Error retrieving operation.
com.google.api.client.googleapis.json.GoogleJsonResponseException: 404 Not Found
{
"code" : 404,
"errors" : [ {
"domain" : "global",
"message" : "The resource 'projects/staging-af5b7922/zones/us-east1-b/operations/operation-1576034201469-599650ebb8354-a9b61b22-436cbfa9' was not found",
"reason" : "notFound"
} ],
"message" : "The resource 'projects/staging-af5b7922/zones/us-east1-b/operations/operation-1576034201469-599650ebb8354-a9b61b22-436cbfa9' was not found"
}
at com.google.api.client.googleapis.json.GoogleJsonResponseException.from(GoogleJsonResponseException.java:150)
at com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:113)
at com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:40)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest$1.interceptResponse(AbstractGoogleClientRequest.java:321)
at com.google.api.client.http.HttpRequest.execute(HttpRequest.java:1056)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:419)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:352)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:469)
at com.google.cloud.graphite.platforms.plugin.client.ComputeWrapper.getZoneOperation(ComputeWrapper.java:164)
at com.google.cloud.graphite.platforms.plugin.client.ComputeClient.getZoneOperation(ComputeClient.java:584)
at com.google.cloud.graphite.platforms.plugin.client.ComputeClient.lambda$waitForOperationCompletion$11(ComputeClient.java:673)
at org.awaitility.core.CallableCondition$ConditionEvaluationWrapper.eval(CallableCondition.java:100)
at org.awaitility.core.ConditionAwaiter$ConditionPoller.call(ConditionAwaiter.java:201)
at org.awaitility.core.ConditionAwaiter$ConditionPoller.call(ConditionAwaiter.java:188)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2020-01-08 00:51:13.602+0000 [id=80] WARNING c.g.c.g.p.p.client.ComputeClient#lambda$waitForOperationCompletion$11: Error retrieving operation.
com.google.api.client.googleapis.json.GoogleJsonResponseException: 404 Not Found
{
"code" : 404,
"errors" : [ {
"domain" : "global",
"message" : "The resource 'projects/staging-af5b7922/zones/us-east1-b/operations/operation-1576033832104-59964f8b77299-3e40225f-37baaed2' was not found",
"reason" : "notFound"
} ],
"message" : "The resource 'projects/staging-af5b7922/zones/us-east1-b/operations/operation-1576033832104-59964f8b77299-3e40225f-37baaed2' was not found"
}
at com.google.api.client.googleapis.json.GoogleJsonResponseException.from(GoogleJsonResponseException.java:150)
at com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:113)
at com.google.api.client.googleapis.services.json.AbstractGoogleJsonClientRequest.newExceptionOnError(AbstractGoogleJsonClientRequest.java:40)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest$1.interceptResponse(AbstractGoogleClientRequest.java:321)
at com.google.api.client.http.HttpRequest.execute(HttpRequest.java:1056)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:419)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.executeUnparsed(AbstractGoogleClientRequest.java:352)
at com.google.api.client.googleapis.services.AbstractGoogleClientRequest.execute(AbstractGoogleClientRequest.java:469)
at com.google.cloud.graphite.platforms.plugin.client.ComputeWrapper.getZoneOperation(ComputeWrapper.java:164)
at com.google.cloud.graphite.platforms.plugin.client.ComputeClient.getZoneOperation(ComputeClient.java:584)
at com.google.cloud.graphite.platforms.plugin.client.ComputeClient.lambda$waitForOperationCompletion$11(ComputeClient.java:673)
at org.awaitility.core.CallableCondition$ConditionEvaluationWrapper.eval(CallableCondition.java:100)
at org.awaitility.core.ConditionAwaiter$ConditionPoller.call(ConditionAwaiter.java:201)
at org.awaitility.core.ConditionAwaiter$ConditionPoller.call(ConditionAwaiter.java:188)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
These operations don't exist in the GCP UI, and I'm not sure which operations it is looking for. The timestamps embedded in the operation IDs appear to be from about a month ago.
Let me know what other debug information would be useful.
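For anyone wanting to check the age of the operations in the log, here is a minimal sketch that decodes the timestamp from an operation ID. It assumes the second dash-separated field of the ID is the creation time in epoch milliseconds; that is an inference from the IDs in the log above, not a documented format. The class and method names are hypothetical.

```java
import java.time.Instant;

public class OperationIdTimestamp {
    // Assumption: IDs look like "operation-<epochMillis>-<hex>-<hex>-<hex>".
    // This layout is inferred from the log output, not taken from GCP documentation.
    static Instant createdAt(String operationId) {
        String[] parts = operationId.split("-");
        if (parts.length < 2 || !parts[0].equals("operation")) {
            throw new IllegalArgumentException("Unexpected operation ID: " + operationId);
        }
        return Instant.ofEpochMilli(Long.parseLong(parts[1]));
    }

    public static void main(String[] args) {
        // The two operation IDs from the warnings above.
        System.out.println(createdAt("operation-1576034201469-599650ebb8354-a9b61b22-436cbfa9"));
        // prints 2019-12-11T03:16:41.469Z
        System.out.println(createdAt("operation-1576033832104-59964f8b77299-3e40225f-37baaed2"));
        // prints 2019-12-11T03:10:32.104Z
    }
}
```

Under that assumption, both operations date from 11 December 2019, about four weeks before the warnings logged on 8 January 2020.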
Hi @andyshinn, I am using Google Compute Engine Plugin version 4.2.0 as well and have the same issue. Everything seems to be functioning OK, but my logs are getting clogged with the following error every 4-5 seconds:
25-Mar-2020 13:53:26.662 WARNING [awaitility-thread] com.google.cloud.graphite.platforms.plugin.client.ComputeClient.lambda$waitForOperationCompletion$11 Error retrieving operation.
com.google.api.client.googleapis.json.GoogleJsonResponseException: 404 Not Found
{
"code" : 404,
"errors" : [ {
"domain" : "global",
"message" : "The resource 'projects/moki-devops/zones/us-central1-f/operations/operation-1583871788596-5a085e3543ef1-e9306505-c894ba66' was not found",
"reason" : "notFound"
} ],
"message" : "The resource 'projects/moki-devops/zones/us-central1-f/operations/operation-1583871788596-5a085e3543ef1-e9306505-c894ba66' was not found"
}
Has anyone found a solution for this?
Any update regarding this issue?
From https://cloud.google.com/compute/docs/instances/viewing-compute-operations#operation-retention-period:
While querying operations, keep in mind that completed operations are automatically removed from the database after a certain period. Compute Engine retains completed operations for at least the minimum retention period of 1 hour, and up to the maximum retention period of 14 days. Although projects often observe a retention period for completed operations that is longer than the minimum of 1 hour, depending on the additional retention period is not recommended.
That suggests it is pointless to check an operation's status more than an hour after it completes. I believe you will see this warning when a node has been failing to connect for more than an hour and its Compute Engine operation has since been rotated out of the database. At that point you need to clean up the node manually.
Hopefully https://github.com/jenkinsci/google-compute-engine-plugin/pull/489 should eventually prevent this.
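To illustrate the retention-period reasoning above, here is a minimal sketch of a poller that treats a "not found" answer as final instead of retrying it forever. This is not the plugin's code and not necessarily what the linked PR does. `OperationLookup`, `Result`, and `waitFor` are hypothetical names, and the lookup is a stand-in for the real Compute Engine call, returning an empty `Optional` where the API would answer 404.

```java
import java.util.Optional;

public class OperationPoller {
    /** Hypothetical stand-in for a zone operation lookup. */
    interface OperationLookup {
        /** Returns the status ("PENDING", "RUNNING", "DONE"), or empty on a 404. */
        Optional<String> status(String operationId);
    }

    enum Result { DONE, GONE, TIMED_OUT }

    static Result waitFor(OperationLookup lookup, String operationId,
                          int maxPolls, long pauseMillis) {
        for (int i = 0; i < maxPolls; i++) {
            Optional<String> status = lookup.status(operationId);
            if (!status.isPresent()) {
                // The operation is past its retention period (or never existed).
                // Polling again cannot succeed, so stop instead of logging forever.
                return Result.GONE;
            }
            if (status.get().equals("DONE")) {
                return Result.DONE;
            }
            try {
                Thread.sleep(pauseMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return Result.TIMED_OUT;
            }
        }
        return Result.TIMED_OUT;
    }

    public static void main(String[] args) {
        // Simulates an operation that has already been rotated out.
        OperationLookup expired = id -> Optional.empty();
        System.out.println(waitFor(expired, "operation-example", 5, 10));
    }
}
```

With a `GONE` result the caller could decide what to do with the node, for example removing it, rather than leaving it stuck until someone cleans it up manually.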