
Windows build received an interrupt or signal from the OS or another process

Open coryan opened this issue 5 years ago • 40 comments

From time to time we experience build failures ending in:

kbuilder@localhost: Permission denied (publickey,password,keyboard-interactive).
rsync: connection unexpectedly closed (0 bytes received so far) [Receiver]
rsync error: unexplained error (code 255) at io.c(226) [Receiver=3.1.3]
nc: connection failed, SOCKS error 5

https://source.cloud.google.com/results/invocations/9bfa7650-9396-4df2-ab53-c87648297dfe/targets

AFAICT, this is a Kokoro issue. I am creating this bug so we have something we can easily point to.

coryan avatar Jun 08 '20 11:06 coryan

https://source.cloud.google.com/results/invocations/6b1b7698-dcd2-4124-b335-6bd35e5fa440

2020-06-12T03:25:04.6844720+00:00 Compressing cache tarball
kokoro_log_reader: Received an interrupt or signal from the OS or another process while sleeping: [Errno 4] Interrupted function call. It's possible that the OS or another process is trying to kill kokoro_log_reader. Attempting to continue anyways.

Break signaled


[ID: 5093278] Build finished after 4635 secs, exit value: 1


Warning: Permanently added 'localhost' (ECDSA) to the list of known hosts.
kbuilder@localhost: Permission denied (publickey,password,keyboard-interactive).
rsync: connection unexpectedly closed (0 bytes received so far) [Receiver]
rsync error: unexplained error (code 255) at io.c(226) [Receiver=3.1.3]
nc: connection failed, SOCKS error 5
kex_exchange_identification: Connection closed by remote host
rsync: connection unexpectedly closed (0 bytes received so far) [Receiver]
rsync error: unexplained error (code 255) at io.c(226) [Receiver=3.1.3]
Warning: Permanently added 'localhost' (ECDSA) to the list of known hosts.
[20:46:22] Collecting build artifacts from build VM
Build script failed with exit code: 1

coryan avatar Jun 12 '20 12:06 coryan

https://source.cloud.google.com/results/invocations/eb2f1c52-c412-4492-906b-dfe8ed9b28e5

coryan avatar Jul 09 '20 12:07 coryan

https://source.cloud.google.com/results/invocations/232a0387-a1ab-423d-bac7-48c8b135fabe

coryan avatar Jul 19 '20 12:07 coryan

https://source.cloud.google.com/results/invocations/ec16f263-4c0e-412e-8692-4226af1deb63

coryan avatar Jul 19 '20 12:07 coryan

A handful of failures, all around the same time, and all on Windows.

For example, https://source.cloud.google.com/results/invocations/03662455-e923-4602-b2f0-35647283399f

devbww avatar Jul 21 '20 05:07 devbww

Slightly different form:

https://source.cloud.google.com/results/invocations/38d2ef34-e88d-4c61-aeab-6a4ba638e557

But no problems with disk space:

2020-07-30T03:59:41.7622040+00:00 Disk(s) size and space for troubleshooting
DeviceID    : C:
DriveType   : 3
VolumeName  :
FreeSpaceGB : 231.20
Capacity    : 300.00

DeviceID    : T:
DriveType   : 3
VolumeName  : tmpfs
FreeSpaceGB : 99.68
Capacity    : 99.98


2020-07-30T03:59:41.8091136+00:00 Running build script for bazel build

2020-07-30T03:59:42.1215779+00:00 Create bazel user root (C:\b)

2020-07-30T03:59:42.1528397+00:00 Capture Bazel information for troubleshooting
Extracting Bazel installation...
Starting local Bazel server and connecting to it...
Build label: 3.3.1
Build target: bazel-out/x64_windows-opt/bin/src/main/java/com/google/devtools/build/lib/bazel/BazelServer_deploy.jar
Build time: Tue Jun 30 15:13:25 2020 (1593530005)
Build timestamp: 1593530005
Build timestamp as int: 1593530005

2020-07-30T03:59:48.5590340+00:00 Shutting down Bazel server

2020-07-30T03:59:48.6683693+00:00 Using T:\tmp as download directory
Activated service account credentials for: [kokoro-build-uploader@cloud-cpp-integration-tests.iam.gserviceaccount.com]

2020-07-30T03:59:55.7791104+00:00  downloading Bazel cache.
kokoro_log_reader: Received an interrupt or signal from the OS or another process while sleeping: [Errno 4] Interrupted function call. It's possible that the OS or another process is trying to kill kokoro_log_reader. Attempting to continue anyways.table directory)
kokoro_log_reader: Exhausted retries waiting for bash to return an integer exit code, returning exit code 137 instead. This is not a Kokoro error! This usually means: 1) Your build was killed by something in the VM 2) Exited normally, but that the parent shell of the nohup'd build process is IO blocked and could not write out the exit_code_file in 60 seconds. In this case be sure that all processes launched by your build are properly cleaned up before exiting to free up resources and reduce IO contention, or 3) There was a BSOD, the VM rebooted, the builder has reconnected but the process is of course gone.

coryan avatar Jul 30 '20 04:07 coryan

https://source.cloud.google.com/results/invocations/f0151f94-cade-47fe-a698-c8719e5cd58b/targets

Loading: 0 packages loaded
kokoro_log_reader: Received an interrupt or signal from the OS or another process while sleeping: [Errno 4] Interrupted function call. It's possible that the OS or another process is trying to kill kokoro_log_reader. Attempting to continue anyways.

[ID: 7500195] Build finished after 69 secs, exit value: 1


Warning: Permanently added 'localhost' (ECDSA) to the list of known hosts.
kbuilder@localhost: Permission denied (publickey,password,keyboard-interactive).
rsync: connection unexpectedly closed (0 bytes received so far) [Receiver]
rsync error: unexplained error (code 255) at io.c(226) [Receiver=3.1.3]
nc: connection failed, SOCKS error 5
kex_exchange_identification: Connection closed by remote host
rsync: connection unexpectedly closed (0 bytes received so far) [Receiver]
rsync error: unexplained error (code 255) at io.c(226) [Receiver=3.1.3]
Warning: Permanently added 'localhost' (ECDSA) to the list of known hosts.
[21:04:36] Collecting build artifacts from build VM
Build script failed with exit code: 1
[21:05:24] Archiving artifacts

coryan avatar Aug 22 '20 13:08 coryan

https://source.cloud.google.com/results/invocations/483a9ebe-3feb-4b3b-ba29-ad54afc61a79/targets

kokoro_log_reader: Received an interrupt or signal from the OS or another process while sleeping: [Errno 4] Interrupted function call. It's possible that the OS or another process is trying to kill kokoro_log_reader. Attempting to continue anyways.  pch.cpp


[ID: 2185511] Build finished after 110 secs, exit value: 1


kex_exchange_identification: Connection closed by remote host
rsync: connection unexpectedly closed (0 bytes received so far) [Receiver]
rsync error: unexplained error (code 255) at io.c(226) [Receiver=3.1.3]
nc: connection failed, SOCKS error 5
kex_exchange_identification: Connection closed by remote host
rsync: connection unexpectedly closed (0 bytes received so far) [Receiver]
rsync error: unexplained error (code 255) at io.c(226) [Receiver=3.1.3]
Warning: Permanently added 'localhost' (ECDSA) to the list of known hosts.

devbww avatar Aug 28 '20 04:08 devbww

https://source.cloud.google.com/results/invocations/8a6ede1e-729a-40d8-8e6f-bd5815a55cb7/targets

[20:49:20] [ID: 2444202] Executing command via SSH:
[20:49:20] 
[20:49:20] git clone https://github.com/googleapis/google-cloud-cpp.git T:/src/github/google-cloud-cpp
[20:49:20] 
[20:49:21] Cloning into '/tmpfs/src/github/google-cloud-cpp'...
[20:52:32] C:/python27/python.exe: can't open file 'T:/kokoro_log_reader.py': [Errno 2] No such file or directory
[20:52:32] 
[20:52:32] 
[20:52:32] [ID: 2444202] Build finished after 192 secs, exit value: 2

devbww avatar Aug 28 '20 04:08 devbww

Seems like a repeat of this bug:

https://source.cloud.google.com/results/invocations/136de31e-5ed3-41f8-bd0a-682f9f2d52b7

[1146/1274] Building CXX object google\cloud\pubsub\CMakeFiles\pubsub_internal_ordering_key_publisher_connection_test.dir\internal\ordering_key_publisher_connection_test.cc.obj
kokoro_log_reader: Received an interrupt or signal from the OS or another process while sleeping: [Errno 4] Interrupted function call. It's possible that the OS or another process is trying to kill kokoro_log_reader. Attempting to continue anyways.
[1147/1274] Building CXX object google\cloud\pubsub\CMakeFiles\pubsub_internal_publisher_metadata_test.dir\internal\publisher_metadata_test.cc.obj
FAILED: google/cloud/pubsub/CMakeFiles/pubsub_internal_publisher_metadata_test.dir/internal/publisher_metadata_test.cc.obj
C:\PROGRA~2\MICROS~1\2019\COMMUN~1\VC\Tools\MSVC\1421~1.277\bin\Hostx86\x86\cl.exe  /nologo /TP -DCARES_STATICLIB -D_WIN32_WINNT=0x600 -D__CLANG_SUPPORT_DYN_ANNOTATION__ -I..\..\ -I. -I..\vcpkg\installed\x86-windows-static\include -Iexternal\googleapis /DWIN32 /D_WINDOWS /W3 /GR /EHsc /MT /O2 /Ob2 /DNDEBUG   /W3 /WX /experimental:external /external:W0 /external:anglebrackets /bigobj /showIncludes /Fogoogle\cloud\pubsub\CMakeFiles\pubsub_internal_publisher_metadata_test.dir\internal\publisher_metadata_test.cc.obj /Fdgoogle\cloud\pubsub\CMakeFiles\pubsub_internal_publisher_metadata_test.dir\ /FS -c ..\..\google\cloud\pubsub\internal\publisher_metadata_test.cc


[ID: 3828949] Build finished after 2805 secs, exit value: 1


Warning: Permanently added 'localhost' (ECDSA) to the list of known hosts.

coryan avatar Sep 25 '20 12:09 coryan

https://source.cloud.google.com/results/invocations/6546c333-ddec-4c9e-97dd-78175e81beb2

coryan avatar Nov 25 '20 04:11 coryan

https://source.cloud.google.com/results/invocations/7a2ad11a-9a04-4b9e-80f9-38bf374ed411

coryan avatar Nov 25 '20 04:11 coryan

Interestingly, this affected two separate VMs at the same time.

coryan avatar Nov 25 '20 04:11 coryan

https://source.cloud.google.com/results/invocations/7c74f9c3-41d3-4f78-a100-8e237c9007a9/targets

[488/1330] Linking CXX executable google\cloud\bigtable\instance_update_config_test.exe

kokoro_log_reader: Received an interrupt or signal from the OS or another process while sleeping: [Errno 4] Interrupted function call. It's possible that the OS or another process is trying to kill kokoro_log_reader. Attempting to continue anyways.

[489/1330] Building CXX object google\cloud\bigtable\CMakeFiles\bigtable_internal_async_longrunning_op_test.dir\internal\async_longrunning_op_test.cc.obj

FAILED: google/cloud/bigtable/CMakeFiles/bigtable_internal_async_longrunning_op_test.dir/internal/async_longrunning_op_test.cc.obj 

C:\PROGRA~2\MICROS~1\2019\COMMUN~1\VC\Tools\MSVC\1421~1.277\bin\Hostx86\x86\cl.exe  /nologo /TP -DCARES_STATICLIB -DGOOGLE_CLOUD_CPP_HAVE_GETRUSAGE=0 -DGOOGLE_CLOUD_CPP_HAVE_RUSAGE_THREAD=0 -D_WIN32_WINNT=0x600 -D__CLANG_SUPPORT_DYN_ANNOTATION__ -I..\..\ -I..\vcpkg\installed\x86-windows-static\include -I. -Iexternal\googleapis /DWIN32 /D_WINDOWS /W3 /GR /EHsc /MT /O2 /Ob2 /DNDEBUG   /W3 /WX /experimental:external /external:W0 /external:anglebrackets /bigobj /showIncludes /Fogoogle\cloud\bigtable\CMakeFiles\bigtable_internal_async_longrunning_op_test.dir\internal\async_longrunning_op_test.cc.obj /Fdgoogle\cloud\bigtable\CMakeFiles\bigtable_internal_async_longrunning_op_test.dir\ /FS -c ..\..\google\cloud\bigtable\internal\async_longrunning_op_test.cc



[ID: 9886498] Build finished after 1109 secs, exit value: 1

devjgm avatar Dec 03 '20 19:12 devjgm

Summary

It looks like all the logs mentioned thus far contain an error message similar to this:

kokoro_log_reader: Received an interrupt or signal from the OS or another process while sleeping: [Errno 4] Interrupted function call. It's possible that the OS or another process is trying to kill kokoro_log_reader.

Sometimes the error goes on to say more.

Then the log says something like:

Timeout, server localhost not responding.
rsync: connection unexpectedly closed (60480999 bytes received so far) [receiver]
rsync error: error in rsync protocol data stream (code 12) at io.c(226) [receiver=3.1.3]
rsync: connection unexpectedly closed (53839 bytes received so far) [generator]
rsync error: unexplained error (code 255) at io.c(226) [generator=3.1.3]
nc: connection failed, SOCKS error 5
kex_exchange_identification: Connection closed by remote host
Connection closed by UNKNOWN port 65535
rsync: connection unexpectedly closed (0 bytes received so far) [Receiver]
rsync error: unexplained error (code 255) at io.c(226) [Receiver=3.1.3]
...

This looks to me like a Kokoro problem, and likely has nothing to do with our build scripts or build image. I'll follow up with the Kokoro folks.
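
For anyone triaging a new failure against this bug, here is a rough sketch of how a saved log could be checked for the signatures quoted above. The match strings are copied from the logs in this issue; the function names (`classify_log`, `looks_like_this_issue`) and the choice to treat the `kokoro_log_reader` message as the deciding signal are my own assumptions, not anything Kokoro provides.

```python
import re
from typing import Dict, Iterable

# Signatures copied from the log excerpts quoted in this issue.
# The dictionary keys are hypothetical labels chosen for illustration.
SIGNATURES = {
    "log_reader_interrupted": re.compile(
        r"kokoro_log_reader: Received an interrupt or signal from the OS"
    ),
    "rsync_closed": re.compile(r"rsync: connection unexpectedly closed"),
    "ssh_kex_closed": re.compile(
        r"kex_exchange_identification: Connection closed"
    ),
    "socks_error": re.compile(r"nc: connection failed, SOCKS error"),
}


def classify_log(lines: Iterable[str]) -> Dict[str, int]:
    """Count how many lines match each known failure signature."""
    counts = {name: 0 for name in SIGNATURES}
    for line in lines:
        for name, pattern in SIGNATURES.items():
            # search() rather than match(): the message is sometimes
            # interleaved with other build output on the same line.
            if pattern.search(line):
                counts[name] += 1
    return counts


def looks_like_this_issue(lines: Iterable[str]) -> bool:
    """Assumption: the kokoro_log_reader interrupt is the common factor."""
    return classify_log(lines)["log_reader_interrupted"] > 0
```

Usage would be something like `looks_like_this_issue(open("build.log", errors="replace"))`, where `build.log` is a placeholder for a locally saved copy of the invocation log. The rsync/ssh/SOCKS counts are only supporting evidence, since they do not appear in every log above.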

devjgm avatar Feb 01 '21 21:02 devjgm

No recurrence for over 90 days. Closing.

devbww avatar May 17 '21 05:05 devbww

https://source.cloud.google.com/results/invocations/6bc14b80-d0b4-4779-82bc-285b55162b2f/log

dbolduc avatar Jul 10 '21 04:07 dbolduc

https://source.cloud.google.com/results/invocations/b9a56486-e070-4154-8972-8bf5ae8c769f

dbolduc avatar Sep 23 '21 04:09 dbolduc

https://source.cloud.google.com/results/invocations/ad2fcedb-ae10-4da1-aec6-126fd88bfbec

devbww avatar Sep 28 '21 16:09 devbww

https://fusion2.corp.google.com/invocations/a79f5fc1-c170-4fce-bae9-a434bb7a55c6/targets/cloud-cpp%2Fgithub%2Fgoogle-cloud-cpp%2Fmain%2Fmacos%2Fquickstart-bazel/log

devjgm avatar Nov 16 '21 20:11 devjgm

https://source.cloud.google.com/results/invocations/ec37c478-82a7-4aee-87cb-32a241a32d53

devbww avatar Dec 07 '21 07:12 devbww

https://source.cloud.google.com/results/invocations/890fac66-59c5-4b76-90df-7a402f653061

coryan avatar Jan 05 '22 13:01 coryan

https://source.cloud.google.com/results/invocations/6e5f2229-c285-4dba-8b49-26a4a18822d6

devbww avatar Jan 12 '22 17:01 devbww

https://source.cloud.google.com/results/invocations/fb138aa5-4654-4268-ae44-ff6797742ffa

coryan avatar Feb 08 '22 13:02 coryan

https://source.cloud.google.com/results/invocations/76b1bf46-f6cb-49e7-a59e-f02aac502aec

coryan avatar Feb 11 '22 12:02 coryan

https://source.cloud.google.com/results/invocations/d0c9bad4-9332-4470-9518-6a39c68c62b8

coryan avatar Feb 11 '22 12:02 coryan

https://source.cloud.google.com/results/invocations/92406c8a-ba47-4f6d-afa3-626227c9a762

devbww avatar Feb 16 '22 04:02 devbww

https://source.cloud.google.com/results/invocations/ac96c957-42b6-48cd-a37f-8d2699d42995/targets
https://source.cloud.google.com/results/invocations/e3593c77-b078-4297-b26b-18bd9b68603b/targets

kokoro_log_reader: Received an interrupt or signal from the OS or another process while sleeping: [Errno 4] Interrupted function call. It's possible that the OS or another process is trying to kill kokoro_log_reader. Attempting to continue anyways.
[457/2177] Building CXX object external\googleapis\CMakeFiles\google_cloud_cpp_bigtable_protos.dir\google\bigtable\admin\v2\bigtable_instance_admin.pb.cc.obj

devjgm avatar Mar 16 '22 14:03 devjgm

https://source.cloud.google.com/results/invocations/92bb24b3-c776-40cd-8a81-381058976206

devbww avatar Mar 27 '22 04:03 devbww

https://source.cloud.google.com/results/invocations/aa133041-91d4-4c8e-9f62-84f2ddb26f80

coryan avatar Apr 05 '22 12:04 coryan