copr
copr copied to clipboard
Not re-notifying FE and msg buses for the new host
There's a bug in the backend <-> fronend protocol -> if copr-rpmbuild is not found on
the builder, the task is never switched to running on FE:
[2023-10-16 07:42:33,269][ INFO][PID:2650505] Marking build as starting
[2023-10-16 07:42:33,460][ INFO][PID:2650505] Trying to allocate VM
[2023-10-16 07:42:36,564][ INFO][PID:2650505] Allocated host ResallocHost, ticket_id=11456, hostname=2620:52:3:1:dead:beef:cafe:c141
[2023-10-16 07:42:36,565][ INFO][PID:2650505] Allocating ssh connection to builder
[2023-10-16 07:42:36,565][ INFO][PID:2650505] Checking that builder machine is OK
[2023-10-16 07:42:36,568][ INFO][PID:2650505] Popen command started: ssh -F /home/copr/.ssh/config mockbuilder@2620:52:3:1:dead:beef:cafe:c141 'rpm -q copr-rpmbuild --qf "%{VERSION}
"'
[2023-10-16 07:42:36,584][ ERROR][PID:2650505] Re-try request for task on 'ResallocHost, ticket_id=11456, hostname=2620:52:3:1:dead:beef:cafe:c141': The copr-rpmbuild package was not found:
[2023-10-16 07:42:36,585][ INFO][PID:2650505] Releasing VM back to pool
[2023-10-16 07:42:36,615][ INFO][PID:2650505] Retry #1 (on other host)
[2023-10-16 07:42:36,617][ INFO][PID:2650505] Trying to allocate VM
[2023-10-16 07:42:42,849][ INFO][PID:2650505] Allocated host ResallocHost, ticket_id=11457, hostname=2620:52:3:1:dead:beef:cafe:c141
[2023-10-16 07:42:42,850][ INFO][PID:2650505] Allocating ssh connection to builder
[2023-10-16 07:42:42,851][ INFO][PID:2650505] Checking that builder machine is OK
[2023-10-16 07:42:42,853][ INFO][PID:2650505] Popen command started: ssh -F /home/copr/.ssh/config mockbuilder@2620:52:3:1:dead:beef:cafe:c141 'rpm -q copr-rpmbuild --qf "%{VERSION}
"'
[2023-10-16 07:42:43,137][ INFO][PID:2650505] Popen command started: ssh -F /home/copr/.ssh/config mockbuilder@2620:52:3:1:dead:beef:cafe:c141 '/usr/bin/test -f /etc/mock/fedora-rawhide-x86_64.cfg'
[2023-10-16 07:42:43,204][ INFO][PID:2650505] Filling build.info file with builder info
[2023-10-16 07:42:43,205][ INFO][PID:2650505] Not re-notifying FE and msg buses for the new host.
[2023-10-16 07:42:43,206][ INFO][PID:2650505] Starting remote build: copr-rpmbuild --verbose --drop-resultdir --task-url https://copr.stg.fedoraproject.org/backend/get-build-task/2915642-fedora-rawhide-x86_64 --chroot fedora-rawhide-x86_64 --detached
[2023-10-16 07:42:43,207][ INFO][PID:2650505] Popen command started: ssh -F /home/copr/.ssh/config mockbuilder@2620:52:3:1:dead:beef:cafe:c141 'copr-rpmbuild --verbose --drop-resultdir --task-url https://copr.stg.fedoraproject.org/backend/get-build-task/2915642-fedora-rawhide-x86_64 --chroot fedora-rawhide-x86_64 --detached'
[2023-10-16 07:42:43,509][ INFO][PID:2650505] Downloading the builder-live.log file, attempt 1
[2023-10-16 07:42:43,511][ INFO][PID:2650505] Popen command started: ssh -F /home/copr/.ssh/config mockbuilder@2620:52:3:1:dead:beef:cafe:c141 copr-rpmbuild-log
- we take first machine
- the machine fails too early (the build doesn't start), so the ticket stays in
starting - no message to BUS is sent either (automation breakage)
- another machine is taken
- we don't send BUS message now (this is the bug!) so no automation can react
This seems to rarely happen, therefore we give this "2years" category. Please report if this happens frequently.