rebuilderd

"phantom" builds

Open h01ger opened this issue 10 months ago • 4 comments

h01ger> libclass-xsaccessor-perl is building since 53h... killing it
h01ger> i'm actually surprised rebuilderd doesn't kill it
jochensp> (I think it is an error on the server side, at least there is no worker on it)
h01ger> except its not building on o4?!?
h01ger> yup
jochensp> same for mpich on i386
h01ger> i guess a bug would be good or maybe not as there's not much info i could provide
h01ger> right
jochensp> I guess the DB state would be interesting but probably something for kp
h01ger> and two haskell packages armhf
jochensp> heh
jochensp> interesting that we have so many just now after so many months
h01ger> i think it's so much that it definitely deserves a bug
h01ger> jochensp: ok to copy this chatlog into a github issue?
jochensp> sure

I also noticed that on armhf, which has 5 workers, those 2 haskell builds have been running for 56h, alongside 3 other builds. So this might be an issue with rebuilderd-worker... investigating now, but also filing this issue now... :)

h01ger, Feb 23 '25 20:02

And indeed, the first worker I checked had this in its screen session:

...
[2025-02-21T11:44:44Z INFO  rebuilderd_worker::download] Downloaded 191416 bytes
[2025-02-21T11:44:44Z INFO  rebuilderd_worker::download] Downloading "http://deb.debian.org/debian/pool/main/h/haskell-th-orphans/libghc-th-orphans-prof_0.13.14-3+b1_armhf.deb" to "/tmp/rebuilderdiaD5KV/inputs/libghc-th-orphans-prof_0.13.14-3+b1_armhf.deb"
[2025-02-21T11:44:44Z INFO  rebuilderd_worker::download] Downloaded 205284 bytes
[2025-02-21T11:44:44Z INFO  rebuilderd_worker::download] Downloading "https://buildinfos.debian.net/buildinfo-pool/h/haskell-th-orphans/haskell-th-orphans_0.13.14-3+b1_armhf.buildinfo" to "/tmp/rebuilderdiaD5KV/inputs/haskell-th-orphans_0.13.14-3+b1_armhf.buildinfo"
[2025-02-22T13:09:14Z WARN  rebuilderd_worker] Failed to ping: HTTP status server error (500 Internal Server Error) for url (https://reproduce.debian.net/armhf/api/v0/build/ping)

The last download above nicely matches the 56h (Feb 21)... so somehow the worker should, ahem, behave differently. As in, restart all over...
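
The timing can be sanity-checked with a little arithmetic. This is only a sketch: the first timestamp is the last download line in the log above, while the second assumes the comment time shown in this thread (Feb 23, 20:02) is in UTC, which the thread does not state.

```python
from datetime import datetime, timezone

# Last download line in the worker log above.
last_log = datetime(2025, 2, 21, 11, 44, 44, tzinfo=timezone.utc)

# Assumption: the comment timestamp in this thread is UTC.
comment = datetime(2025, 2, 23, 20, 2, tzinfo=timezone.utc)

stalled_hours = (comment - last_log).total_seconds() / 3600
print(f"{stalled_hours:.1f}h")  # prints "56.3h"
```

Under that assumption the worker had been silent for a bit over 56 hours, which is consistent with the build age reported for the two haskell packages.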

h01ger, Feb 23 '25 20:02

For the record, this happened again yesterday on two workers...

h01ger, Mar 03 '25 15:03

Are there any child processes under rebuilderd-worker when that happens (especially, for example, in Z state)?
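
One way to collect that information on a Linux worker is to walk `/proc` and print every process below the worker together with its state. The following is a minimal sketch, not part of rebuilderd: the helper names `read_stat` and `descendants` are made up for illustration, and it assumes a Linux `/proc` filesystem. A state of `Z` marks a zombie, that is, a child that has exited but has not been reaped by its parent.

```python
import os
import sys
from pathlib import Path


def read_stat(pid):
    """Return (ppid, state, comm) parsed from /proc/<pid>/stat."""
    raw = Path(f"/proc/{pid}/stat").read_text(errors="replace")
    # Format: "pid (comm) state ppid ...". comm may contain spaces or ')',
    # so split on the LAST closing parenthesis.
    head, _, tail = raw.rpartition(")")
    comm = head.split("(", 1)[1]
    fields = tail.split()
    return int(fields[1]), fields[0], comm


def descendants(root_pid):
    """List (pid, state, comm) for every process below root_pid."""
    children = {}  # ppid -> [pid, ...]
    info = {}      # pid -> (state, comm)
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        pid = int(entry)
        try:
            ppid, state, comm = read_stat(pid)
        except (FileNotFoundError, ProcessLookupError, PermissionError):
            continue  # process vanished or is not readable
        info[pid] = (state, comm)
        children.setdefault(ppid, []).append(pid)

    result = []
    stack = list(children.get(root_pid, []))
    while stack:
        pid = stack.pop()
        state, comm = info[pid]
        result.append((pid, state, comm))
        stack.extend(children.get(pid, []))
    return result


if __name__ == "__main__":
    if len(sys.argv) != 2:
        sys.exit("usage: children.py <pid of rebuilderd-worker>")
    for pid, state, comm in descendants(int(sys.argv[1])):
        note = "  <-- zombie" if state == "Z" else ""
        print(f"{pid:>8} {state} {comm}{note}")
```

Run it with the PID of the stuck rebuilderd-worker process as the only argument. An empty output would mean the worker has no children left at all, which is also useful to know.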

kpcyrd, Mar 03 '25 15:03

I'll try to remember to look the next time(s) I see this happening. :-)

h01ger, Mar 03 '25 15:03