azure-vm-agents-plugin
Windows VM Conflict Error, continuation
Jenkins and plugins versions report
Environment
Jenkins: 2.452.2
OS: Linux - 6.1.124-134.200.amzn2023.x86_64
Java: 17.0.13 - Amazon.com Inc. (OpenJDK 64-Bit Server VM)
---
Office-365-Connector:4.21.1
active-directory:2.35
allure-jenkins-plugin:2.31.1
amazon-ecs:1.49
analysis-model-api:12.3.3
ansible:403.v8d0ca_dcb_b_502
ansicolor:1.0.4
ant:497.v94e7d9fffa_b_9
antisamy-markup-formatter:162.v0e6ec0fcfcf6
apache-httpcomponents-client-4-api:4.5.14-208.v438351942757
apache-httpcomponents-client-5-api:5.3.1-110.v77252fb_d4da_5
artifactory:4.0.8
asm-api:9.7-33.v4d23ef79fcc8
atlassian-bitbucket-server-integration:4.0.0
audit-trail:361.v82cde86c784e
authentication-tokens:1.119.v50285141b_7e1
authorize-project:1.7.2
aws-credentials:231.v08a_59f17d742
aws-java-sdk:1.12.753-463.v071a_97315959
aws-java-sdk-api-gateway:1.12.753-463.v071a_97315959
aws-java-sdk-autoscaling:1.12.753-463.v071a_97315959
aws-java-sdk-cloudformation:1.12.753-463.v071a_97315959
aws-java-sdk-cloudfront:1.12.753-463.v071a_97315959
aws-java-sdk-codebuild:1.12.753-463.v071a_97315959
aws-java-sdk-codedeploy:1.12.753-463.v071a_97315959
aws-java-sdk-ec2:1.12.753-463.v071a_97315959
aws-java-sdk-ecr:1.12.753-463.v071a_97315959
aws-java-sdk-ecs:1.12.753-463.v071a_97315959
aws-java-sdk-efs:1.12.753-463.v071a_97315959
aws-java-sdk-elasticbeanstalk:1.12.753-463.v071a_97315959
aws-java-sdk-elasticloadbalancingv2:1.12.753-463.v071a_97315959
aws-java-sdk-iam:1.12.753-463.v071a_97315959
aws-java-sdk-kinesis:1.12.753-463.v071a_97315959
aws-java-sdk-lambda:1.12.753-463.v071a_97315959
aws-java-sdk-logs:1.12.753-463.v071a_97315959
aws-java-sdk-minimal:1.12.753-463.v071a_97315959
aws-java-sdk-organizations:1.12.753-463.v071a_97315959
aws-java-sdk-secretsmanager:1.12.753-463.v071a_97315959
aws-java-sdk-sns:1.12.753-463.v071a_97315959
aws-java-sdk-sqs:1.12.753-463.v071a_97315959
aws-java-sdk-ssm:1.12.753-463.v071a_97315959
aws-lambda:0.5.10
azure-ad:484.v5fd019a_39b_18
azure-cli:0.9
azure-container-agents:253.vd2f5cd5c5040
azure-credentials:312.v0f3973cd1e59
azure-sdk:174.va_89c1df897d2
azure-vm-agents:941.v72b_1cca_b_cd22
bitbucket:241.v6d24a_57f9359
bitbucket-push-and-pull-request:3.0.2
blueocean:1.27.13
blueocean-autofavorite:1.2.5
blueocean-bitbucket-pipeline:1.27.13
blueocean-commons:1.27.13
blueocean-config:1.27.13
blueocean-core-js:1.27.13
blueocean-dashboard:1.27.13
blueocean-display-url:2.4.3
blueocean-events:1.27.13
blueocean-git-pipeline:1.27.13
blueocean-github-pipeline:1.27.13
blueocean-i18n:1.27.13
blueocean-jira:1.27.13
blueocean-jwt:1.27.13
blueocean-personalization:1.27.13
blueocean-pipeline-api-impl:1.27.13
blueocean-pipeline-editor:1.27.13
blueocean-pipeline-scm-api:1.27.13
blueocean-rest:1.27.13
blueocean-rest-impl:1.27.13
blueocean-web:1.27.13
bootstrap5-api:5.3.3-1
bouncycastle-api:2.30.1.78.1-248.ve27176eb_46cb_
branch-api:2.1169.va_f810c56e895
browserstack-integration:1.2.13
build-environment:1.7
build-failure-analyzer:2.5.2
build-timestamp:1.0.3
build-user-vars-plugin:166.v52976843b_435
build-with-parameters:76.v9382db_f78962
caffeine-api:3.1.8-133.v17b_1ff2e0599
checks-api:2.2.0
chromedriver:1.2
cloud-stats:336.v788e4055508b_
cloudbees-bitbucket-branch-source:888.v8e6d479a_1730
cloudbees-folder:6.928.v7c780211d66e
cobertura:1.17
code-coverage-api:4.99.0
codedeploy:1.23
command-launcher:107.v773860566e2e
commons-compress-api:1.26.1-2
commons-httpclient3-api:3.1-3
commons-lang3-api:3.14.0-76.vda_5591261cfe
commons-text-api:1.12.0-119.v73ef73f2345d
conditional-buildstep:1.4.3
config-file-provider:973.vb_a_80ecb_9a_4d0
configurationslicing:548.ve92d48e66b_f8
copyartifact:746.vd2a_674fb_4f6f
countjobs-viewstabbar:1.0.1
coverage:1.16.1
credentials:1361.v56f5ca_35d21c
credentials-binding:681.vf91669a_32e45
cucumber-reports:5.8.1
custom-tools-plugin:0.8
data-tables-api:2.0.8-1
datadog:7.1.1
dependency-check-jenkins-plugin:5.5.1
dependency-track:5.0.0
display-url-api:2.204.vf6fddd8a_8b_e9
docker-commons:439.va_3cb_0a_6a_fb_29
docker-java-api:3.3.6-90.ve7c5c7535ddd
docker-plugin:1.6.2
docker-workflow:580.vc0c340686b_54
durable-task:555.v6802fe0f0b_82
ec2:1688.v8c07e01d657f
echarts-api:5.5.0-1
eddsa-api:0.3.0-4.v84c6f0f4969e
email-ext:1814.v404722f34263
envinject:2.908.v66a_774b_31d93
envinject-api:1.199.v3ce31253ed13
extended-choice-parameter:382.v5697b_32134e8
extended-read-permission:53.v6499940139e5
extensible-choice-parameter:1.8.1
external-monitor-job:215.v2e88e894db_f8
favorite:2.218.vd60382506538
file-operations:214.v2e7dc7f25757
flatpickr-api:4.6.13-5.v534d8025a_a_59
font-awesome-api:6.5.2-1
forensics-api:2.4.0
git:5.2.2
git-changelog:3.38
git-client:5.0.0
git-parameter:0.9.19
git-server:126.v0d945d8d2b_39
github:1.39.0
github-api:1.318-461.v7a_c09c9fa_d63
github-branch-source:1789.v5b_0c0cea_18c3
gradle:2.12
groovy:457.v99900cb_85593
gson-api:2.11.0-41.v019fcf6125dc
h2-api:11.1.4.199-30.v1c64e772f3a_c
handy-uri-templates-2-api:2.1.8-30.v7e777411b_148
hidden-parameter:237.v4b_df26c7a_f0e
htmlpublisher:1.35
instance-identity:185.v303dc7c645f9
ionicons-api:74.v93d5eb_813d5f
ivy:2.6
jackson2-api:2.17.0-379.v02de8ec9f64c
jacoco:3.3.6
jakarta-activation-api:2.1.3-1
jakarta-mail-api:2.1.3-1
javadoc:243.vb_b_503b_b_45537
javax-activation-api:1.2.0-7
javax-mail-api:1.6.2-10
jaxb:2.3.9-1
jdk-tool:73.vddf737284550
jenkins-design-language:1.27.13
jersey2-api:2.42-147.va_28a_44603b_d5
jira:3.13
jira-trigger:1.0.3
jjwt-api:0.11.5-112.ve82dfb_224b_a_d
jnr-posix-api:3.1.19-2
job-dsl:1.87
joda-time-api:2.12.7-29.v5a_b_e3a_82269a_
jquery:1.12.4-1
jquery3-api:3.7.1-2
jsch:0.2.16-86.v42e010d9484b_
json-api:20240303-41.v94e11e6de726
json-path-api:2.9.0-58.v62e3e85b_a_655
junit:1265.v65b_14fa_f12f0
ldap:725.v3cb_b_711b_1a_ef
lockable-resources:1255.vf48745da_35d0
log-parser:2.3.4
mailer:472.vf7c289a_4b_420
mapdb-api:1.0.9-40.v58107308b_7a_7
mask-passwords:173.v6a_077a_291eb_5
matrix-auth:3.2.2
matrix-project:832.va_66e270d2946
maven-plugin:3.23
mercurial:1260.vdfb_723cdcc81
metrics:4.2.21-451.vd51df8df52ec
mina-sshd-api-common:2.13.1-117.v2f1a_b_66ff91d
mina-sshd-api-core:2.13.1-117.v2f1a_b_66ff91d
monitoring:1.99.0
msbuild:1.33
mstest:1.0.5
node-iterator-api:55.v3b_77d4032326
nodejs:1.6.1
nuget:1.1
nunit:485.ve8a_85357320d
okhttp-api:4.11.0-172.vda_da_1feeb_c6e
packer:1.5
pam-auth:1.11
parameter-separator:166.vd0120849b_386
parameterized-scheduler:277.v61a_4b_a_49a_c5c
parameterized-trigger:806.vf6fff3e28c3e
periodicbackup:2.0
pipeline-build-step:540.vb_e8849e1a_b_d8
pipeline-github-lib:61.v629f2cc41d83
pipeline-graph-analysis:216.vfd8b_ece330ca_
pipeline-groovy-lib:727.ve832a_9244dfa_
pipeline-input-step:495.ve9c153f6067b_
pipeline-maven:1421.v610fa_b_e2d60e
pipeline-maven-api:1421.v610fa_b_e2d60e
pipeline-maven-database:1421.v610fa_b_e2d60e
pipeline-milestone-step:119.vdfdc43fc3b_9a_
pipeline-model-api:2.2205.vc9522a_9d5711
pipeline-model-definition:2.2205.vc9522a_9d5711
pipeline-model-extensions:2.2205.vc9522a_9d5711
pipeline-rest-api:2.34
pipeline-stage-step:312.v8cd10304c27a_
pipeline-stage-tags-metadata:2.2205.vc9522a_9d5711
pipeline-stage-view:2.34
pipeline-utility-steps:2.17.0
plain-credentials:183.va_de8f1dd5a_2b_
plugin-usage-plugin:4.5
plugin-util-api:4.1.0
postgresql-api:42.7.2-40.v76d376d65c77
powershell:2.1
prism-api:1.29.0-15
prisma-cloud-iac-scan:1.3.5
prisma-cloud-jenkins-plugin:31.00.129
pubsub-light:1.18
rebuild:332.va_1ee476d8f6d
remote-file:1.24
run-condition:1.7
saml:4.464.vea_cb_75d7f5e0
scalable-amazon-ecs:1.0
schedule-build:577.v0613c45b_9eef
scm-api:690.vfc8b_54395023
script-security:1341.va_2819b_414686
simple-theme-plugin:191.vcd207ef9dd24
slack:722.vd07f1ea_7ff40
snakeyaml-api:2.2-111.vc6598e30cc65
sonar:2.17.2
splunk-devops:1.10.1
splunk-devops-extend:1.10.1
sse-gateway:1.27
ssh-agent:367.vf9076cd4ee21
ssh-credentials:337.v395d2403ccd4
ssh-slaves:2.973.v0fa_8c0dea_f9f
sshd:3.330.vc866a_8389b_58
stashNotifier:1.492.v1b_33f185ee18
strict-crumb-issuer:2.1.1
structs:338.v848422169819
subversion:1269.v53185011cd9f
terraform:1.0.10
thycotic-credentials:1.0
timestamper:1.27
token-macro:400.v35420b_922dcb_
trilead-api:2.147.vb_73cc728a_32e
uno-choice:2.8.3
variant:60.v7290fc0eb_b_cd
warnings-ng:11.3.0
windows-azure-storage:419.v4046cd70d2e3
workflow-aggregator:600.vb_57cdd26fdd7
workflow-api:1316.v33eb_726c50b_a_
workflow-basic-steps:1058.vcb_fc1e3a_21a_9
workflow-cps:3908.vd6b_b_5a_a_54010
workflow-durable-task-step:1360.v82d13453da_a_f
workflow-job:1400.v7fd111b_ec82f
workflow-multibranch:783.787.v50539468395f
workflow-scm-step:427.v4ca_6512e7df1
workflow-step-api:678.v3ee58b_469476
workflow-support:920.v59f71ce16f04
xray-for-jira-connector:1.2.1
What Operating System are you using (both controller, and any agents involved in the problem)?
Controller: custom RHEL 8. Agent: custom Windows Server 2022.
Reproduction steps
- Create a custom Windows Server 2022 template
- Confirm that it creates a node correctly
- Wait about 24 hours (or until the next day) and try again
- A Conflict error occurs
Expected Results
Windows node should operate as normal
Actual Results
com.microsoft.azure.vmagent.exceptions.AzureCloudException: Deployment Failed: Microsoft.Compute/virtualMachines:azwin4d3080 - Conflict - com.azure.resourcemanager.resources.models.StatusMessage@692fe522
at com.microsoft.azure.vmagent.exceptions.AzureCloudException.create(AzureCloudException.java:37)
at com.microsoft.azure.vmagent.AzureVMCloud.createProvisionedAgent(AzureVMCloud.java:581)
at com.microsoft.azure.vmagent.AzureVMCloud$2.call(AzureVMCloud.java:843)
at com.microsoft.azure.vmagent.AzureVMCloud$2.call(AzureVMCloud.java:820)
at jenkins.util.ContextResettingExecutorService$2.call(ContextResettingExecutorService.java:46)
at jenkins.security.ImpersonatingExecutorService$2.call(ImpersonatingExecutorService.java:80)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
at java.base/java.lang.Thread.run(Thread.java:840)
Anything else?
After about 1-1.5 hours of trying to create a node and hitting the error above, it fixes itself.
See https://github.com/jenkinsci/azure-vm-agents-plugin/issues/327#issuecomment-2827749048 for other troubleshooting steps I took. All have failed so far.
Are you interested in contributing a fix?
Not familiar with this type of programming.
Is there anything in the Azure resource groups deployment logs?
Surprisingly no, and there is a huge gap for some odd reason, even though there were failures. Will try to keep an eye out if it does happen again.
OK, this needs a minor code change so that the following is logged properly:
com.azure.resourcemanager.resources.models.StatusMessage@692fe522
Please upgrade to the latest version; the one you're using is nearly a year old.
The logging issue was fixed 7 months ago.
Got some errors and they're all the same.
Deployment: "message": "At least one resource deployment operation failed. Please list deployment operations for details. Please see https://aka.ms/arm-deployment-operations for usage details.",
VM: "message": "The resource write operation failed to complete successfully, because it reached terminal provisioning state 'Failed'."
This usually happens at the start of the day, then clears itself up after around 10-20 tries.
In the meantime, I'm working with the team to get the plugin up to date.
I would check the activity log for the resource group in the portal; there might be more info there.
Found this in the RG activity logs, in the create-or-update VM step:
"statusMessage": "{\"status\":\"Failed\",\"error\":{\"code\":\"ResourceOperationFailure\",\"message\":\"The resource operation completed with terminal provisioning state 'Failed'.\",\"details\":[{\"code\":\"OSProvisioningClientError\",\"message\":\"OS Provisioning for VM 'azwina91c10' did not finish in the allotted time. However, the VM guest agent was detected running. This suggests the guest OS has not been properly prepared to be used as a VM image (with CreateOption=FromImage). To resolve this issue, either use the VHD as is with CreateOption=Attach or prepare it properly for use as an image:\\r\\n * Instructions for Windows: https://learn.microsoft.com/azure/virtual-machines/windows/prepare-for-upload-vhd-image\\r\\n * Instructions for Linux: https://learn.microsoft.com/azure/virtual-machines/linux/create-upload-generic \"}]}}",
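The `statusMessage` field quoted above is a JSON document encoded as a string inside another JSON document, which makes the real error code easy to miss. A minimal sketch of decoding it twice to list the error codes follows. The outer wrapper object, the shortened message text, and the helper name `decode_status_message` are assumptions made for illustration; only the field names and codes come from the log entry above.

```python
import json

# Abbreviated copy of the activity log field quoted above. The outer object
# is a minimal wrapper (assumption) and the long inner message is shortened.
raw_entry = (
    r'''{"statusMessage": "{\"status\":\"Failed\",\"error\":{'''
    r'''\"code\":\"ResourceOperationFailure\",'''
    r'''\"message\":\"The resource operation completed with terminal provisioning state 'Failed'.\",'''
    r'''\"details\":[{\"code\":\"OSProvisioningClientError\",'''
    r'''\"message\":\"OS Provisioning for VM 'azwina91c10' did not finish in the allotted time.\"}]}}"}'''
)


def decode_status_message(entry_json: str) -> dict:
    """Parse the log entry, then parse the JSON string stored in statusMessage."""
    outer = json.loads(entry_json)
    return json.loads(outer["statusMessage"])


status = decode_status_message(raw_entry)
error = status["error"]
# Top-level code first, then the more specific codes nested under "details".
codes = [error["code"]] + [d["code"] for d in error.get("details", [])]
print(codes)
```

The most specific code here is `OSProvisioningClientError`, which is the part of the entry that points at image preparation rather than at the plugin.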
Also to note: if I create a VM manually from this custom image, I don't experience any issues or errors.
Edit 1: Saw this quick failure (maybe patient zero?)
There are no good activity logs for this other than "Failed/Error". No messages.
If you click on the one that failed, you should be able to get a JSON error message showing what went wrong.
That's the full error. I couldn't reproduce another of the faster failures; there were no messages or anything useful, just listing resources. Here's the output of the ones that run longer:
{
  "code": "DeploymentFailed",
  "target": "/subscriptions/a0953841-3e8a-47de-8205-251628d01fee/resourceGroups/gccoemgmt-jenkins-prod-rg-01/providers/Microsoft.Resources/deployments/azwin0505135302435",
  "message": "At least one resource deployment operation failed. Please list deployment operations for details. Please see https://aka.ms/arm-deployment-operations for usage details.",
  "details": [
    {
      "code": "ResourceDeploymentFailure",
      "target": "/subscriptions/a0953841-3e8a-47de-8205-251628d01fee/resourceGroups/gccoemgmt-jenkins-prod-rg-01/providers/Microsoft.Compute/virtualMachines/azwin4c0670",
      "message": "The resource write operation failed to complete successfully, because it reached terminal provisioning state 'Failed'."
    }
  ]
}
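Deployment errors like the one above nest the per-resource failures under `details`, so the top-level `DeploymentFailed` code alone says little. A minimal sketch of flattening that structure into one line per error follows. The helper name `flatten_errors` is hypothetical and the message texts are omitted; the codes and targets are copied from the output above.

```python
import json

# Codes and targets copied from the error output above; messages omitted.
deployment_error = json.loads("""
{
  "code": "DeploymentFailed",
  "target": "/subscriptions/a0953841-3e8a-47de-8205-251628d01fee/resourceGroups/gccoemgmt-jenkins-prod-rg-01/providers/Microsoft.Resources/deployments/azwin0505135302435",
  "details": [
    {
      "code": "ResourceDeploymentFailure",
      "target": "/subscriptions/a0953841-3e8a-47de-8205-251628d01fee/resourceGroups/gccoemgmt-jenkins-prod-rg-01/providers/Microsoft.Compute/virtualMachines/azwin4c0670"
    }
  ]
}
""")


def flatten_errors(error, depth=0):
    """Yield (depth, code, resource_name) for an error and all nested details."""
    target = error.get("target", "")
    # The resource name is the last path segment of the ARM resource ID.
    resource = target.rsplit("/", 1)[-1] if target else None
    yield depth, error.get("code"), resource
    for detail in error.get("details", []):
        yield from flatten_errors(detail, depth + 1)


flat = list(flatten_errors(deployment_error))
for depth, code, resource in flat:
    print("  " * depth + f"{code}: {resource}")
```

In this case the walk shows that the deployment `azwin0505135302435` failed because of the single VM resource `azwin4c0670`, which is the one to look up in the activity log.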
Not sure. You may need to raise a support ticket to see if they can tell why the VM is failing to provision. It may be an issue in the region you are deploying to.
After updating the azure-vm-agents plugin on our dev Jenkins to 1013.v7a_2a_cd831714, we were able to see better error logs.
Instead of the conflict error we got:
java.lang.UnsupportedClassVersionError: hudson/slaves/SlaveComputer$SlaveVersion has been compiled by a more recent version of the Java Runtime (class file version 61.0), this version of the Java Runtime only recognizes class file versions up to 55.0
Updating the version of Java used on the image to 17 fixed the issue. I will do some further testing and report back next week or close the ticket.
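For readers decoding the `UnsupportedClassVersionError` above: for Java 5 and later, the class file major version is the Java release number plus 44, so 61.0 corresponds to Java 17 and 55.0 to Java 11. A minimal sketch of reading both numbers out of the message follows. The function names are hypothetical and the regular expressions assume the message wording shown above.

```python
import re

MESSAGE = (
    "hudson/slaves/SlaveComputer$SlaveVersion has been compiled by a more recent "
    "version of the Java Runtime (class file version 61.0), this version of the "
    "Java Runtime only recognizes class file versions up to 55.0"
)


def java_release(class_major: int) -> int:
    """Map a class file major version to a Java release (valid for major >= 49)."""
    return class_major - 44


def explain_version_error(message: str) -> tuple[int, int]:
    """Return (Java release the class needs, Java release the runtime supports)."""
    compiled = re.search(r"class file version (\d+)\.\d+", message)
    supported = re.search(r"versions up to (\d+)\.\d+", message)
    if compiled is None or supported is None:
        raise ValueError("message does not look like an UnsupportedClassVersionError")
    return java_release(int(compiled.group(1))), java_release(int(supported.group(1)))


needed, running = explain_version_error(MESSAGE)
print(f"class needs Java {needed}, agent runtime is Java {running}")
```

This matches the fix described above: the agent image was running Java 11 while the controller's classes were built for Java 17.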
After additional testing with Jenkins 2.492.2 and azure-vm-agents:1026.v6b_6edb_b_e3fff:
The conflict error still appears, but much less often. The older plugin needed around 5-10 attempts before "fixing itself"; the latest, even after updating to Java 17, needs around 1-2. By "fixing itself" I mean that we let the build continue while the plugin keeps attempting to create nodes.
The deployment logs for failures are removed almost immediately, so it's hard to grab that information. So far, the errors have been the same as in a previous comment.
Seems to be a one-off thing. We're using custom images, so that could be the culprit: the old Windows 2016 images we're using were created and provisioned manually, whereas the Windows 2022 images are provisioned automatically with Packer. I can't really reproduce it, as some days everything works perfectly fine and other days it doesn't.