General Protection Fault in GitHub Actions container in `balin`
What needs to change?
When running `catkin_make -j10` (where 10 is the number of processes the Docker container is limited to), the Python executable is killed roughly 20 seconds into the build, at around 15%-20% completion. This occurs within a Docker container that has been moved between computers.
Some details:
- The `dmesg` output shows the following trap: `[12151.484884] traps: python3[2025829] general protection fault ip:56b5d5 sp:7ffd85f2c0a0 error:0 in python3.8[423000+294000]`
- I suspect that the `python3` process in the error log corresponds to `catkin_make`, as `catkin_make` runs using Python. Other Python processes are running as well, but they appear to exit cleanly.
- The `catkin_make` output indicates that it is being terminated unexpectedly.
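
To make traps like the one above easier to compare across runs, here is a minimal sketch that parses a `traps:` line and computes how far the faulting instruction pointer sits from the start of the reported mapping. The helper name `parse_trap` and the regular expression are my own assumptions, based only on the single line quoted above; other kernel versions may format the line differently. If repeated crashes land on the same offset, that would hint at a software bug, while scattered offsets would be more consistent with the hardware theory, though neither is conclusive.

```python
import re

# Regex for the "traps:" line format seen in the dmesg output above.
# This is an assumption based on that one sample, not a kernel-defined grammar.
TRAP_RE = re.compile(
    r"traps: (?P<comm>\S+)\[(?P<pid>\d+)\] (?P<reason>.+?) "
    r"ip:(?P<ip>[0-9a-f]+) sp:(?P<sp>[0-9a-f]+) error:(?P<error>[0-9a-f]+)"
    r"(?: in (?P<image>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\])?"
)


def parse_trap(line):
    """Parse one dmesg trap line (hypothetical helper).

    Returns a dict of fields, or None if the line does not match.
    When the mapping is reported, 'offset' is ip minus the mapping base.
    """
    match = TRAP_RE.search(line)
    if match is None:
        return None
    info = match.groupdict()
    info["pid"] = int(info["pid"])
    # ip, sp, and error are printed in hexadecimal without a 0x prefix.
    for key in ("ip", "sp", "error"):
        info[key] = int(info[key], 16)
    if info["base"] is not None:
        info["base"] = int(info["base"], 16)
        info["size"] = int(info["size"], 16)
        info["offset"] = info["ip"] - info["base"]
    return info


if __name__ == "__main__":
    sample = (
        "[12151.484884] traps: python3[2025829] general protection fault "
        "ip:56b5d5 sp:7ffd85f2c0a0 error:0 in python3.8[423000+294000]"
    )
    info = parse_trap(sample)
    print(info["comm"], info["pid"], hex(info["offset"]))
    # prints: python3 2025829 0x1485d5
```

Piping `dmesg` output through this line by line after each failed build would give a quick list of offsets to compare.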
Possible Causes:
- This error is happening inside a Docker container, which should be reusable and composable. The error could be hardware-related, as suggested by a similar Proxmox forum post. The container has been moved between computers multiple times, and this is the first occurrence of such errors.
- Additionally, there is an issue with `pip` on this computer: it often downloads files with bad CRC checks. While it usually succeeds after a couple of tries, the first attempt often fails. This could be another indicator of a deeper issue.
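
One crude way to probe the hardware theory without involving the network: checksum the same in-memory buffer repeatedly and count how often the result changes. On a healthy machine the count should always be zero. This is only a sketch; the function name `count_crc_mismatches`, the buffer size, and the round count are arbitrary choices of mine, and a clean run does not rule out faulty RAM or CPU (a dedicated memory tester is far more thorough).

```python
import os
import zlib


def count_crc_mismatches(size_bytes=64 * 1024 * 1024, rounds=20):
    """Checksum one random buffer repeatedly and count disagreements.

    Hypothetical diagnostic: the data never changes, so any mismatch
    means the computation or the memory holding the buffer misbehaved.
    """
    data = os.urandom(size_bytes)
    reference = zlib.crc32(data)
    mismatches = 0
    for _ in range(rounds):
        if zlib.crc32(data) != reference:
            mismatches += 1
    return mismatches


if __name__ == "__main__":
    bad = count_crc_mismatches()
    print(f"{bad} mismatching checksum(s)")
```

Running it both on the host and inside the container, ideally while the build is loading the machine, would show whether mismatches appear at all.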
How would this task be tested?
- Ensure that the CI build runs to completion without `catkin_make` being killed.