Docker auto build stuck forever

Open darren opened this issue 5 years ago • 54 comments

Problem description

repository: https://hub.docker.com/r/darrenhoo/centos5-go-bootstrap

The build log shows that the build has finished, but its status is still "In progress".

auto-build-hangs

darren avatar Jan 07 '20 02:01 darren

Finally, it finished:

Created: January 7, 2020 02:37 AM Finished: January 7, 2020 10:37 AM

Queue time: 240m Build time: 240m

darren avatar Jan 07 '20 08:01 darren

Bumping to reopen this. I'm experiencing this today, and

Queue time: 240m
Build time: 240m

looks pretty fishy. Seems like a build completed hook somewhere just isn't working.

meowsbits avatar Mar 06 '20 15:03 meowsbits
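
A quick check of the figures reported above, as a minimal sketch: it only re-does the arithmetic on the timestamps and durations quoted in this thread and says nothing about how Docker Hub computes them. The variable names are illustrative.

```python
from datetime import datetime

# Timestamps copied from the build summary quoted earlier in this thread.
FMT = "%B %d, %Y %I:%M %p"
created = datetime.strptime("January 7, 2020 02:37 AM", FMT)
finished = datetime.strptime("January 7, 2020 10:37 AM", FMT)

# Durations as displayed on the build page, in minutes.
queue_minutes = 240
build_minutes = 240

elapsed_minutes = int((finished - created).total_seconds() // 60)

print(elapsed_minutes)                                   # 480
print(elapsed_minutes == queue_minutes + build_minutes)  # True
print(queue_minutes / 60)                                # 4.0
```

The gap between Created and Finished is exactly the sum of the two reported durations, and each duration is a round four hours, which fits the suspicion that these are limits being hit rather than real queue and build times.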

Seeing my build stuck today as well, 4 hours now. Also noticing duplicate builds when I push to GitHub.

tcosentino avatar Mar 09 '20 22:03 tcosentino

Same here, I have this currently: the build is complete but still stuck, holding up other pushes.

NotExpectedYet avatar Apr 28 '20 17:04 NotExpectedYet

Mine just finally completed like 4 hours after.

It was probably my fault, to be honest: I'd cancelled the build halfway through.

NotExpectedYet avatar Apr 29 '20 06:04 NotExpectedYet

Same here, I have this currently: the build is complete but still stuck, holding up other pushes.

Not sure if it is related to my issue, but somehow a second hook had been created in GitHub and each build was being triggered twice. I'm not 100% sure that was what hung the build, but things seemed to get better when I removed it.

tcosentino avatar Apr 29 '20 14:04 tcosentino
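
For anyone wanting to check for the duplicate-hook situation described above, here is a minimal sketch. It assumes you have already fetched the repository's webhook list and that each record has the `id`, `events`, and `config.url` fields that GitHub's "list repository webhooks" REST endpoint returns. The function name and the sample data are hypothetical, and whether duplicate hooks actually cause the hang is not confirmed anywhere in this thread.

```python
from collections import defaultdict


def find_duplicate_hooks(hooks):
    """Return {(url, events): [hook ids]} for hooks that share a URL and event set."""
    groups = defaultdict(list)
    for hook in hooks:
        key = (hook["config"]["url"], tuple(sorted(hook.get("events", []))))
        groups[key].append(hook["id"])
    # Keep only the groups that contain more than one hook.
    return {key: ids for key, ids in groups.items() if len(ids) > 1}


# Hypothetical sample records; the URLs are placeholders, not real trigger URLs.
sample_hooks = [
    {"id": 1, "events": ["push"], "config": {"url": "https://example.invalid/build-trigger"}},
    {"id": 2, "events": ["push"], "config": {"url": "https://example.invalid/build-trigger"}},
    {"id": 3, "events": ["push"], "config": {"url": "https://example.invalid/other"}},
]

duplicates = find_duplicate_hooks(sample_hooks)
print(duplicates)
```

Any key that shows up in the result has more than one hook delivering the same events to the same URL, which would explain a single push starting two builds.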

Currently seeing this in one of my builds, going on for 2 hours but it usually finishes in 15 minutes.

Edit: it has finished now.

purefan avatar Apr 29 '20 21:04 purefan

IN PROGRESS forever!

Simplatex avatar May 03 '20 12:05 Simplatex

Same here, I have this currently: the build is complete but still stuck, holding up other pushes.

Not sure if it is related to my issue, but somehow a second hook had been created in GitHub and each build was being triggered twice. I'm not 100% sure that was what hung the build, but things seemed to get better when I removed it.

I removed duplicate webhooks some time ago, but I'm still experiencing this issue. I had it on 23rd April and again today (3rd June). According to the build logs the build finished, but it gets stuck in progress. Cancelling does not help. I sent an email to support on both occasions.

machaven avatar Jun 03 '20 13:06 machaven

Mine was stuck because I'd cancelled it at an inopportune time in the build process, I gather.

I didn't have any second webhooks created anyway.

NotExpectedYet avatar Jun 03 '20 14:06 NotExpectedYet

Piling on. Build finished in 1 minute, build showed in progress for several hours after the build completed, so I hit cancel. Several hours later it's still trying to cancel the build. In short, a 1 minute build has been "running" for 4 hours now.

This issue comes back every few months... the rate of regression on this issue is horrible.

johnvcoleman-w24 avatar Jul 20 '20 20:07 johnvcoleman-w24

I imagine this isn't meant to be a proper deployment solution. I have been using it as such for a hobby project, where Watchtower just updates my containers when a new version of the image is available, but perhaps we should just see it as "eventually available".

purefan avatar Jul 21 '20 19:07 purefan

Just got bit by a variant of this on https://hub.docker.com/r/relaysh/pulumi-step-run . The build against 9c03e9e finished and pushed successfully according to the log, where the last message was "Build finished", but it was still in progress according to the dashboard. I clicked "Cancel" and now it's in "Cancelling" status and has been for several hours. Further builds are stacked up in "Pending", waiting for this cancellation.

ahpook avatar Jul 24 '20 19:07 ahpook

This is clearly an issue when you cancel a build that has already failed or completed. Not sure why this can't be targeted.

savager avatar Jul 30 '20 02:07 savager

I have the same issue, and there is no way to clean it up.

arvtiwar avatar Aug 06 '20 20:08 arvtiwar

Experiencing this today. I didn't cancel anything; it was just a normal auto-triggered build.

ondrovic avatar Oct 09 '20 16:10 ondrovic

Same here, 1 minute to finish build but 4 hours stuck already

EthraZa avatar Oct 09 '20 20:10 EthraZa

We had the same on Friday and today it's happening again.

machaven avatar Oct 12 '20 11:10 machaven

I have this issue: a C# .NET Core build failed with "DockerBuild -o /app/build' returned a non-zero code: 1", and after that the build hung.

Madpeterz avatar Oct 13 '20 14:10 Madpeterz

Workaround: delete the Docker Hub repo and recreate it with a new name.
Side effect: anyone using the old repo name will be forced to update.

Madpeterz avatar Oct 13 '20 19:10 Madpeterz

Same issue for me, happened twice over 2 weeks. The last time, I turned down my remote instance, which pulls new images using watchdog, and I noticed it didn't take as long. I wonder if it has something to do with remote machines trying to pull the image while it's being built? I did notice that when I manually tried pulling the new image while the Docker repo was stalled, my remote machine did download the new image (though it took about an hour for what is normally a 2-minute job).

mdrive20 avatar Oct 13 '20 22:10 mdrive20

Same here: built and published in 1 minute, hanging for 2h and counting.

o3bvv avatar Nov 05 '20 13:11 o3bvv

Is there any way to clean this up? I tried to stop the build, and that hung as well.

pupattan avatar Nov 05 '20 18:11 pupattan

Hi everyone, thanks for your feedback.

Along with the recent comments here, we have had a few support tickets come in recently regarding hanging builds. Builds may time out four hours after becoming stuck. We fixed one cause in September, but I see there have been quite a few reports from everyone since then.

I've created a sprint ticket on our end to investigate.

shawnaxsom avatar Nov 06 '20 12:11 shawnaxsom

@shawnaxsom This is still very much an issue - do you have any updates?

dylankbuckley avatar Dec 04 '20 12:12 dylankbuckley

The same. Bump

alexks02 avatar Dec 04 '20 13:12 alexks02

Sorry all, @dylankbuckley and @alexks02, work on this issue has been pushed to the current sprint. It's earmarked for one of our engineers to start on after he finishes up his current tasks this sprint, hopefully soon. I'm not sure how difficult of an issue it will be to reproduce and address, so I can't guarantee a timeline, but it is a priority for us.

It may have been a little worse today, as event processing backed up slightly during one of our deployments. That engineer is helping speed up processing of today's build events. I'll share your comments here with him.

shawnaxsom avatar Dec 04 '20 20:12 shawnaxsom

How often has this been happening lately?

We don't think there is a bug that is causing the issue. The comments we've seen in this thread have primarily been a queuing issue, where the queue of builds becomes backed up.

We've changed the services infrastructure in the past month, and we've been addressing some bottlenecks in our overall infrastructure that hopefully have lowered the number of times this has happened to you recently (besides the queue backup 4 days ago). Let me know otherwise if it is happening often (and at what frequency would help).

shawnaxsom avatar Dec 08 '20 16:12 shawnaxsom

Every 2nd day. It bricks up for 4-6 hours.

dylankbuckley avatar Dec 08 '20 16:12 dylankbuckley

Can’t say I’ve had it recently because I avoid it like the plague, but for me it was easy to replicate: accidentally push a branch while it was already building, triggering another build.

savager avatar Dec 08 '20 16:12 savager