SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0x7b1c1b
During execution of a job, I get:
Failed to find the SystemVssConnection Endpoint, try to finish job as failedpanic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x58 pc=0x7b466e]
goroutine 177657 [running]:
github.com/ChristopherHX/github-act-runner/actionsrunner.runJob.func1.2()
/home/runner/work/github-act-runner/github-act-runner/actionsrunner/runner.go:492 +0xee
created by github.com/ChristopherHX/github-act-runner/actionsrunner.runJob.func1 in goroutine 177597
/home/runner/work/github-act-runner/github-act-runner/actionsrunner/runner.go:487 +0xb05
Then, when trying to restart, I immediately get:
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0x7b1c1b]
goroutine 23 [running]:
github.com/ChristopherHX/github-act-runner/protocol.(*VssConnection).FinishJob(...)
/home/runner/work/github-act-runner/github-act-runner/protocol/connection.go:375
github.com/ChristopherHX/github-act-runner/actionsrunner.(*RunRunner).Run.func5.2()
/home/runner/work/github-act-runner/github-act-runner/actionsrunner/runner.go:174 +0x11b
created by github.com/ChristopherHX/github-act-runner/actionsrunner.(*RunRunner).Run.func5 in goroutine 22
/home/runner/work/github-act-runner/github-act-runner/actionsrunner/runner.go:172 +0x54f
After a couple of attempts it will eventually start.
This makes my runner go down every now and then.
> During execution of a job, I get:
Your tenant might have been upgraded, or my test job runs too briefly to hit this.
Did the job crash before showing progress on GitHub?
Could you run your runner using --trace? A job should always have a SystemVssConnection. Crash 1 is in
https://github.com/ChristopherHX/github-act-runner/blob/d77360ee0369d8331e08aa95738c86700e1f7a38/actionsrunner/runner.go#L492
And possibly send the log privately over email (there are tokens everywhere that you should censor; sharing it may be a bad idea if your job uses custom secrets).
It is my fault for ignoring the error returned by my function. If this routine does not work, your job can only run for about 10 minutes.
Crash 2
I need to deactivate this feature for the newer GitHub Actions backend.
Otherwise, I can improve the error handling at the code position that crashed; possibly that is enough.
I ran into it again. The last GET request in the log was https://broker.actions.githubusercontent.com/message?runnerVersion=3.0.0&sessionId=<session-id>&status=Busy
And then:
golang_7eb16954-092a-47d0-a14c-d1e0dda1ca47 ( https://github.com/haskell ): Running Job ''
golang_7eb16954-092a-47d0-a14c-d1e0dda1ca47 ( https://github.com/haskell ): Failed to find the SystemVssConnection Endpoint, try to finish job as failedpanic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0x7b5238]
goroutine 107389 [running]:
github.com/ChristopherHX/github-act-runner/protocol.(*VssConnection).FinishJob(...)
/home/runner/work/github-act-runner/github-act-runner/protocol/connection.go:375
github.com/ChristopherHX/github-act-runner/actionsrunner.(*DefaultWorkerContext).FinishJob(0x86ce905c0, {0xc22b9a, 0x6}, 0x86cd38068)
/home/runner/work/github-act-runner/github-act-runner/actionsrunner/worker_context.go:85 +0x338
github.com/ChristopherHX/github-act-runner/actionsrunner.(*DefaultWorkerContext).Init(0x86ce905c0)
/home/runner/work/github-act-runner/github-act-runner/actionsrunner/worker_context.go:140 +0x798
github.com/ChristopherHX/github-act-runner/actionsrunner.runJob.func1()
/home/runner/work/github-act-runner/github-act-runner/actionsrunner/runner.go:531 +0xcf4
created by github.com/ChristopherHX/github-act-runner/actionsrunner.runJob in goroutine 105658
/home/runner/work/github-act-runner/github-act-runner/actionsrunner/runner.go:414 +0x250
I upgraded to 0.12.0 and now I get:
Body: `{"source":"actions-run-service","statusCode":409,"errorMessage":"job assignment is invalid: MissingKey"}`
Error: fatal error, see log
github-act-runner has basically become unusable for me. It's crashing all the time.
Now I have the case where the agent seems to continue running when I check the console on the server, but the github UI tells me it's offline. This happens very quickly after I start the runner.
> I upgraded to 0.12.0 and now I get:
> Body: `{"source":"actions-run-service","statusCode":409,"errorMessage":"job assignment is invalid: MissingKey"}` Error: fatal error, see log
This message is not a fatal error for me in 0.12.0. I believe this is a race condition in GitHub's backend.
Maybe you have shortened the log way too much. Known errors that cause a fatal error are:
- runner removed from GitHub, has a different message
- you executed `github-act-runner run --once` and something bad happened, has a different message
I would try removing the runner locally by deleting
- sessions.json
- settings.json
And then configure it again with GitHub Actions. This runner allows you to configure a single runner multiple times to listen for a group of repos and orgs.
I can confirm your first message that you still had this panic issue before updating to v0.12.0.
> but the github UI tells me it's offline. This happens very quickly after I start the runner.
Hmm, at what exact registration level are you using this runner?
- Repository-level runners work perfectly fine for me right now
- Org-level runners are not covered by my testing workflows right now
- Non-ephemeral runners are not covered by my testing workflows right now
It ran for a while without a crash, but now it is back:
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x100 pc=0x9ab43d]
goroutine 146751 [running]:
nhooyr.io/websocket.(*Conn).writer(0x0, {0xdfb130?, 0x86c9785b0?}, 0x419ae5?)
/home/runner/go/pkg/mod/nhooyr.io/[email protected]/write.go:111 +0x1d
nhooyr.io/websocket.(*Conn).Writer(0x86c9785b0?, {0xdfb130?, 0x86c9785b0?}, 0xdf6820?)
/home/runner/go/pkg/mod/nhooyr.io/[email protected]/write.go:28 +0x1d
nhooyr.io/websocket/wsjson.write({0xdfb130?, 0x86c9785b0?}, 0x12a05f200?, {0xb599a0, 0x86cdeb340})
/home/runner/go/pkg/mod/nhooyr.io/[email protected]/wsjson/wsjson.go:54 +0x99
nhooyr.io/websocket/wsjson.Write(...)
/home/runner/go/pkg/mod/nhooyr.io/[email protected]/wsjson/wsjson.go:48
github.com/ChristopherHX/github-act-runner/protocol/logger.(*WebsocketLivelogger).SendLog(0x86ced8f00, 0x86cdeb340)
/home/runner/work/github-act-runner/github-act-runner/protocol/logger/job_logger.go:95 +0x6f
github.com/ChristopherHX/github-act-runner/protocol/logger.(*WebsocketLiveloggerWithFallback).SendLog(0x86cf15400, 0x86cdeb340)
/home/runner/work/github-act-runner/github-act-runner/protocol/logger/job_logger.go:148 +0x4c
github.com/ChristopherHX/github-act-runner/protocol/logger.(*BufferedLiveLogger).sendLogs(0x86cc94f80, 0x86c74ae70, 0x86ca9ea80?)
/home/runner/work/github-act-runner/github-act-runner/protocol/logger/job_logger.go:225 +0x7f
created by github.com/ChristopherHX/github-act-runner/protocol/logger.(*BufferedLiveLogger).SendLog in goroutine 146730
/home/runner/work/github-act-runner/github-act-runner/protocol/logger/job_logger.go:247 +0x107
I created a patch in https://github.com/ChristopherHX/github-act-runner/pull/226, but this slightly reveals how many error-handling problems exist in my runner application. (The cause here is that the websocket disconnected and reconnecting failed, so the connection handle is now nil, which the code did not expect.)