Unexpected error loading prebuild
Bug description
rpc error: code = FailedPrecondition desc = cannot initialize workspace: prebuild initializer: Git fallback: git initializer gitClone: mkdir /dst/spring-petclinic: no such file or directory

Workspace affected
gitpodio-springpetclini-8a38a5a57eu
Expected behavior
- Log when this happens (we don't currently).
- Ideally we'd wait longer for the file system to be ready before trying to clone
Example repository
None
Anything else?
How long do we wait for the file system to be ready?
We already have a log entry for this: https://github.com/gitpod-io/gitpod/blob/6390f2064394ebccf788f4ff5b57fd66e0313ce1/components/content-service/pkg/initializer/git.go#L75
Also, looking at the logs, it looks like the workspace failed, and then five minutes later ws-daemon tried to run the initializer for it (???):

I cannot quite understand what exactly happened in that workspace's lifecycle.
Thanks for looking at this one, @sagor999! @jenting, prior to resuming the PVC work, could you peek at this to see what you can find? It's the last of the broken windows we found in the gen59 traces.
- Log when this happens (we don't currently).
We log it already.
- Ideally we'd wait longer for the file system to be ready before trying to clone
We haven't run git clone yet; the error reports that os.MkdirAll(ws.Location, 0775) failed 🤔
I thought the error happened at this line: https://github.com/gitpod-io/gitpod/blob/3d97d5552ec092938327c3813d53a04038c3db7f/components/content-service/pkg/initializer/git.go#L75
However, I did not find the isGitWS span in the tracing: https://github.com/gitpod-io/gitpod/blob/3d97d5552ec092938327c3813d53a04038c3db7f/components/content-service/pkg/initializer/git.go#L64 🤔
I'm blocked on this issue for now.
I can't reproduce it locally in a way that makes os.MkdirAll(ws.Location, 0775) fail with no such file or directory. Is this because the mount point /dst/ is not ready yet?
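For context, here is a minimal Go sketch (my own illustration, not the actual ws-daemon code) of the one situation I can see where the kernel reports exactly mkdir /dst/spring-petclinic: no such file or directory: the leaf's parent directory is gone at the moment of the mkdir syscall. Since os.MkdirAll normally recreates missing parents, it can only surface this error if /dst disappears between its internal existence check and the final mkdir, i.e. a race with whatever tears /dst down:

```go
package main

import (
	"fmt"
	"log"
	"os"
	"path/filepath"
)

func main() {
	// Hypothetical stand-in for the workspace location /dst/spring-petclinic.
	base, err := os.MkdirTemp("", "dst")
	if err != nil {
		log.Fatal(err)
	}
	loc := filepath.Join(base, "spring-petclinic")

	// Simulate /dst being torn down (e.g. the workspace was stopped and its
	// content mount removed) before the initializer's mkdir runs.
	if err := os.RemoveAll(base); err != nil {
		log.Fatal(err)
	}

	// A plain mkdir of the leaf now fails with the same shape as the report:
	//   mkdir .../spring-petclinic: no such file or directory
	fmt.Println(os.Mkdir(loc, 0775))

	// os.MkdirAll would simply recreate the missing parent here, so for
	// MkdirAll to return this error the parent has to vanish *between* its
	// internal existence check and the final mkdir of the leaf.
}
```

If that reasoning holds, "not ready yet" is less likely than "already gone again".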
:wave: @jenting were you able to find anything meaningful via Google searches, or recreate similar misbehavior in https://go.dev/play/? I ask so that we can have that context when sharing this issue with a teammate next week.
For now, let's leave this blocked. Later this week I'll inspect the frequency of this error; the frequency will determine whether we reassign it to another teammate while you're out on vacation, etc.
👋 @jenting were you able to find anything meaningful via Google searches, or recreate similar misbehavior in https://go.dev/play/? I ask so that we can have that context when sharing this issue with a teammate next week.
I did some Google searching and wrote similar code locally to try to reproduce the error, but no luck so far.
This is odd. From the log, the error comes from here. However, I can't see this line's warning log in the GCP logs.
Note: we filter by instanceId="a7ad0fe1-3ebc-4786-a207-366f8c7c1e47"
@jenting are you still blocked and in need of help from the team (if so, please reach out in #t_workspace), or do you have more info to go on now because of this thread?
If this PR doesn't fix the problem, we'll have to write code to check whether the container is still alive: https://github.com/gitpod-io/gitpod/pull/12215
This is related to this: https://github.com/gitpod-io/gitpod/issues/12282
If StopWorkspace was called while the workspace was still doing content init, then it may fail with exactly this error, as ws-daemon does not know that the workspace was stopped and /dst has disappeared.
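If we do end up needing the "check the container is still alive" code mentioned above, a rough sketch of that kind of guard might look like this (hypothetical names, not the actual content-service API): re-check the destination mount right before creating the clone directory and return a dedicated error when the workspace has already been torn down:

```go
// Sketch only: hypothetical helper, not gitpod's actual initializer code.
package initializer

import (
	"errors"
	"fmt"
	"os"
)

// ErrWorkspaceGone is a hypothetical sentinel for "the workspace was stopped
// while content init was still running", so callers can log and handle it
// distinctly instead of surfacing a generic "no such file or directory".
var ErrWorkspaceGone = errors.New("workspace content location disappeared during init")

// ensureCloneTarget re-checks the parent mount (e.g. /dst) right before
// creating the clone directory. If the parent is gone, the workspace was most
// likely stopped mid-init and retrying git clone is pointless.
func ensureCloneTarget(parent, location string) error {
	if _, err := os.Stat(parent); errors.Is(err, os.ErrNotExist) {
		return ErrWorkspaceGone
	} else if err != nil {
		return fmt.Errorf("cannot stat %s: %w", parent, err)
	}

	// Note: this check is still racy; a more robust fix is for ws-daemon to
	// consult the workspace's lifecycle state (did StopWorkspace run?) rather
	// than the filesystem alone.
	if err := os.MkdirAll(location, 0775); err != nil {
		return fmt.Errorf("cannot create %s: %w", location, err)
	}
	return nil
}
```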
This is related to this: #12282 If StopWorkspace was called while the workspace was still doing content init, then it may fail with exactly this error, as ws-daemon does not know that the workspace was stopped and /dst has disappeared.
I just wonder who deletes $wsRoot/dst? kubelet or our component? Do you know?
Could be that our housekeeping job in ws-daemon does that? :thinking:
I just wonder who deletes $wsRoot/dst? kubelet or our component? Do you know?
I have the same question.
Since we are not sure whether #12282 addressed this issue or not, we might need to consider reopening this issue.
There are probably two patterns in this issue:
- Initialization runs after the stop request: https://github.com/gitpod-io/gitpod/issues/12282
- For some reason, initialization starts very late.
I put the log links for those two patterns in this PR. Perhaps this PR will improve both, but it is unclear if they will be resolved. https://github.com/gitpod-io/gitpod/pull/12215
So, I think if it happens again, we should reopen.
Perhaps this PR will improve both, but it is unclear if they will be resolved.
We need to check the Jaeger tracing on the gen63 cluster to see whether it still happens.
It still happens 😭 https://cloudlogging.app.goo.gl/4Hk68KGGpKBS1wyk9
FYI, we need webapp to add logging so that we can know "why" StopWorkspace is being called, via https://github.com/gitpod-io/gitpod/issues/12282. Once that is done, we can proceed with this particular issue.
Added the Blocked label, because we're waiting for webapp to schedule and add the logging in https://github.com/gitpod-io/gitpod/issues/12282
This is no longer blocked as of https://github.com/gitpod-io/gitpod/issues/12283
During the refinement meeting, @utam0k mentioned he saw the error recently. We don't know exactly how to approach this other than looking at the new logs and trying to understand what's going on. There's currently no hypothesis.
@jenting @utam0k As this is Scheduled, and not In-progress, I removed you both as assignees. This way, it is "free" for later; when someone has bandwidth, it can be assigned and the status changed accordingly. :smile: Have a nice day, you two! :wave:
@kylos101 :100: Thanks
@sagor999 could you peek at the new logs, to see why the workspaces are stopping, to help form a plan of attack for this? I'm going to move this from Scheduled to the Inbox for now.
Hm. I looked in the traces (US) and in the GCP logs for that error and cannot find it. :thinking:
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

There are a few different error messages involved, so I closed this issue.