System unavailable: Various linux aarch64 machines

Open adamfarley opened this issue 2 years ago • 5 comments

On the other two machines, the Java agent dies shortly after being re-enabled, with exit code 126 (command found, but it could not be executed).
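For anyone unfamiliar with exit status 126: POSIX shells return it when a command is found but cannot be executed, most often because the file lacks the execute bit. The sketch below reproduces that locally. It is only an illustration, not the agent launch command from these machines; the file name `agent-start.sh` and the function name are made up for the example, and it assumes a POSIX system with `/bin/sh`.

```python
import os
import shlex
import stat
import subprocess
import tempfile


def exit_code_for_non_executable_script() -> int:
    """Run a script without the execute bit via /bin/sh and return the exit status."""
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical file name, used only for this demonstration.
        script = os.path.join(tmp, "agent-start.sh")
        with open(script, "w") as handle:
            handle.write("#!/bin/sh\necho started\n")
        # rw------- : readable and writable, but deliberately not executable.
        os.chmod(script, stat.S_IRUSR | stat.S_IWUSR)
        # The shell finds the file but cannot execute it.
        result = subprocess.run(
            ["/bin/sh", "-c", shlex.quote(script)],
            capture_output=True,
            text=True,
        )
        return result.returncode


if __name__ == "__main__":
    print(exit_code_for_non_executable_script())
```

If the agent really is exiting with 126, file permissions and `noexec` mount options on the agent's working directory may be worth checking alongside the memory theory.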

One theory is that these machines are out of / low on memory. Can they be restarted please?
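A rough way to test the memory theory before restarting, assuming shell access to the machines and a Linux kernel that reports `MemAvailable` in `/proc/meminfo`. The function names are hypothetical, and what counts as "low" is a judgment call that is not defined anywhere in this issue.

```python
def parse_meminfo(text: str) -> dict:
    """Parse /proc/meminfo-style text into a mapping of field name to integer value."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value" pairs
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[name.strip()] = int(fields[0])  # sizes are reported in kB
    return values


def available_fraction(text: str) -> float:
    """Return MemAvailable / MemTotal for the given meminfo text."""
    info = parse_meminfo(text)
    return info["MemAvailable"] / info["MemTotal"]


if __name__ == "__main__":
    with open("/proc/meminfo") as handle:
        fraction = available_fraction(handle.read())
    print(f"{fraction:.1%} of memory is available")
```

A very small fraction would support the out-of-memory theory; a healthy one would point back at the agent launch itself.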

If this does not fix the issue, perhaps they should be re-initialised from scratch (re-provisioned/re-ansibled).

adamfarley avatar Jun 12 '23 13:06 adamfarley

These 2 are back online following an issue whereby the dockerhost became unreachable.

test-docker-centos8-armv8-1
test-docker-debian11-armv8-1

steelhead31 avatar Jun 15 '23 17:06 steelhead31

I think the host key for these 2 servers has changed, so Jenkins can't connect to them:
test-alibaba-ubuntu1804-armv8-1
test-alibaba-ubuntu1804-armv8-2

I don't have permissions to correct this.

steelhead31 avatar Jun 15 '23 17:06 steelhead31

@Haroon-Khel Do you have access to, or a contact for, these 2 Alibaba machines?

steelhead31 avatar Jul 14 '23 12:07 steelhead31

@Haroon-Khel Did you hear back from Alibaba about the machines they're hosting for us?

sxa avatar Oct 31 '23 15:10 sxa

I've removed the Alibaba machines from Jenkins. They can be added again if we obtain replacements.

sxa avatar Apr 05 '24 10:04 sxa

Monitoring in place, closing so new issues can be raised as required.

steelhead31 avatar Oct 01 '24 14:10 steelhead31