github-act-runner

SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0x7b1c1b

Open hasufell opened this issue 5 months ago • 8 comments

During execution of a job, I get:

Failed to find the SystemVssConnection Endpoint, try to finish job as failedpanic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x58 pc=0x7b466e]

goroutine 177657 [running]:
github.com/ChristopherHX/github-act-runner/actionsrunner.runJob.func1.2()
        /home/runner/work/github-act-runner/github-act-runner/actionsrunner/runner.go:492 +0xee
created by github.com/ChristopherHX/github-act-runner/actionsrunner.runJob.func1 in goroutine 177597
        /home/runner/work/github-act-runner/github-act-runner/actionsrunner/runner.go:487 +0xb05

Then trying to restart I get immediately:

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0x7b1c1b]

goroutine 23 [running]:
github.com/ChristopherHX/github-act-runner/protocol.(*VssConnection).FinishJob(...)
        /home/runner/work/github-act-runner/github-act-runner/protocol/connection.go:375
github.com/ChristopherHX/github-act-runner/actionsrunner.(*RunRunner).Run.func5.2()
        /home/runner/work/github-act-runner/github-act-runner/actionsrunner/runner.go:174 +0x11b
created by github.com/ChristopherHX/github-act-runner/actionsrunner.(*RunRunner).Run.func5 in goroutine 22
        /home/runner/work/github-act-runner/github-act-runner/actionsrunner/runner.go:172 +0x54f

After a couple of attempts it will eventually start.

This makes my runner go down every now and then.

hasufell Aug 08 '25 08:08

> During execution of a job, I get:

Your tenant might have been upgraded, or my test job is running too short.

Did the job crash before showing any progress on GitHub?

Could you run your runner with --trace? A job should always have a SystemVssConnection. Crash 1 is at https://github.com/ChristopherHX/github-act-runner/blob/d77360ee0369d8331e08aa95738c86700e1f7a38/actionsrunner/runner.go#L492

And possibly send the log privately over email. It has tokens everywhere that you should censor, so sharing it may be a bad idea if your job uses custom secrets.

It is my fault for ignoring the error returned by my function. If this routine does not work, your job can only run for about 10 minutes.

Crash 2

I need to deactivate this feature for the newer GitHub Actions Backend.


Otherwise I can improve the error handling at the code position that crashed; that may be enough.
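As an illustration of what improved error handling at such a position could look like, here is a minimal Go sketch. It is not the runner's actual code: the `Endpoint` and `Connection` types and the `findConnection` and `finishJobSafely` helpers are hypothetical stand-ins. The idea is only that a missing SystemVssConnection endpoint is reported as an error instead of leaving a nil pointer that is dereferenced later.

```go
package main

import (
	"errors"
	"fmt"
)

// Endpoint and Connection are hypothetical stand-ins, not the runner's real types.
type Endpoint struct {
	Name string
	URL  string
}

type Connection struct {
	BaseURL string
}

// FinishJob reads a field of the receiver, so calling it on a nil
// *Connection would panic with a nil pointer dereference.
func (c *Connection) FinishJob(result string) error {
	fmt.Printf("finishing job at %s with result %q\n", c.BaseURL, result)
	return nil
}

// ErrNoConnection signals that the job message carried no usable endpoint.
var ErrNoConnection = errors.New("no SystemVssConnection endpoint in job message")

// findConnection returns an error instead of a nil *Connection when the
// endpoint is missing, so callers cannot forget the nil case silently.
func findConnection(endpoints []Endpoint) (*Connection, error) {
	for _, e := range endpoints {
		if e.Name == "SystemVssConnection" {
			return &Connection{BaseURL: e.URL}, nil
		}
	}
	return nil, ErrNoConnection
}

// finishJobSafely checks the lookup error before touching the connection.
func finishJobSafely(endpoints []Endpoint, result string) error {
	conn, err := findConnection(endpoints)
	if err != nil {
		return fmt.Errorf("cannot finish job as %s: %w", result, err)
	}
	return conn.FinishJob(result)
}

func main() {
	// Missing endpoint: an error is returned, the process keeps running.
	if err := finishJobSafely(nil, "failed"); err != nil {
		fmt.Println("handled without panic:", err)
	}
	// Endpoint present: the job is finished normally.
	ok := []Endpoint{{Name: "SystemVssConnection", URL: "https://example.invalid"}}
	if err := finishJobSafely(ok, "succeeded"); err != nil {
		fmt.Println("unexpected error:", err)
	}
}
```

With this shape the caller decides what to do when the endpoint is absent (log, retry, or drop the job) rather than the goroutine crashing the whole runner.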

ChristopherHX Aug 08 '25 15:08

I ran into it again. The last GET request in the log was https://broker.actions.githubusercontent.com/message?runnerVersion=3.0.0&sessionId=<session-id>&status=Busy

And then:

golang_7eb16954-092a-47d0-a14c-d1e0dda1ca47 ( https://github.com/haskell ): Running Job ''
golang_7eb16954-092a-47d0-a14c-d1e0dda1ca47 ( https://github.com/haskell ): Failed to find the SystemVssConnection Endpoint, try to finish job as f
ailedpanic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0x7b5238]

goroutine 107389 [running]:
github.com/ChristopherHX/github-act-runner/protocol.(*VssConnection).FinishJob(...)
        /home/runner/work/github-act-runner/github-act-runner/protocol/connection.go:375
github.com/ChristopherHX/github-act-runner/actionsrunner.(*DefaultWorkerContext).FinishJob(0x86ce905c0, {0xc22b9a, 0x6}, 0x86cd38068)
        /home/runner/work/github-act-runner/github-act-runner/actionsrunner/worker_context.go:85 +0x338
github.com/ChristopherHX/github-act-runner/actionsrunner.(*DefaultWorkerContext).Init(0x86ce905c0)
        /home/runner/work/github-act-runner/github-act-runner/actionsrunner/worker_context.go:140 +0x798
github.com/ChristopherHX/github-act-runner/actionsrunner.runJob.func1()
        /home/runner/work/github-act-runner/github-act-runner/actionsrunner/runner.go:531 +0xcf4
created by github.com/ChristopherHX/github-act-runner/actionsrunner.runJob in goroutine 105658
        /home/runner/work/github-act-runner/github-act-runner/actionsrunner/runner.go:414 +0x250

hasufell Oct 27 '25 11:10

I upgraded to 0.12.0 and now I get:

Body: `{"source":"actions-run-service","statusCode":409,"errorMessage":"job assignment is invalid: MissingKey"}`
Error: fatal error, see log

hasufell Oct 27 '25 11:10

github-act-runner has basically become unusable for me. It's crashing all the time.

hasufell Oct 27 '25 11:10

Now I have the case where the agent seems to continue running when I check the console on the server, but the github UI tells me it's offline. This happens very quickly after I start the runner.

hasufell Oct 27 '25 11:10

> I upgraded to 0.12.0 and now I get:
>
> Body: `{"source":"actions-run-service","statusCode":409,"errorMessage":"job assignment is invalid: MissingKey"}`
> Error: fatal error, see log

For me, this message is not a fatal error in 0.12.0. I believe it is caused by a race condition in GitHub's backend.
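To make the distinction concrete, here is a small Go sketch that decodes the logged body and classifies it. The struct mirrors only the three fields visible in the log above. Treating HTTP 409 as retryable is an assumption made for illustration, not a statement about how github-act-runner or GitHub's backend actually behaves; `runServiceError`, `parseRunServiceError` and `isRetryable` are hypothetical names.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

// runServiceError mirrors the three fields visible in the logged body.
type runServiceError struct {
	Source       string `json:"source"`
	StatusCode   int    `json:"statusCode"`
	ErrorMessage string `json:"errorMessage"`
}

// parseRunServiceError decodes the JSON error body.
func parseRunServiceError(body []byte) (runServiceError, error) {
	var e runServiceError
	err := json.Unmarshal(body, &e)
	return e, err
}

// isRetryable encodes an assumed policy: a 409 Conflict is treated as a
// transient condition to log and skip, not as a reason to stop the runner.
func isRetryable(e runServiceError) bool {
	return e.StatusCode == http.StatusConflict
}

func main() {
	body := []byte(`{"source":"actions-run-service","statusCode":409,"errorMessage":"job assignment is invalid: MissingKey"}`)
	e, err := parseRunServiceError(body)
	if err != nil {
		fmt.Println("could not decode body:", err)
		return
	}
	fmt.Printf("source=%s status=%d retryable=%t\n", e.Source, e.StatusCode, isRetryable(e))
}
```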

Maybe you shortened the log too much. Known causes of a fatal error are:

  • the runner was removed from GitHub; this has a different message
  • you executed github-act-runner run --once and something bad happened; this has a different message

I would try removing the runner locally by deleting:

  • sessions.json
  • settings.json

Then configure it for GitHub Actions again. This runner allows you to configure a single runner multiple times so that it listens for a group of repos and orgs.


Judging from your first message, I can confirm that you still had this panic issue before updating to v0.12.0.

> but the github UI tells me it's offline. This happens very quickly after I start the runner.

Hmm, at what exact registration level are you using this runner?

  • Repository level runners work perfectly fine for me right now
  • Org level is not covered by my testing workflows right now
  • Non-ephemeral runners are not covered by my testing workflows right now

ChristopherHX Oct 27 '25 12:10

It went a while without a crash, but now it is back:

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x100 pc=0x9ab43d]

goroutine 146751 [running]:
nhooyr.io/websocket.(*Conn).writer(0x0, {0xdfb130?, 0x86c9785b0?}, 0x419ae5?)
        /home/runner/go/pkg/mod/nhooyr.io/[email protected]/write.go:111 +0x1d
nhooyr.io/websocket.(*Conn).Writer(0x86c9785b0?, {0xdfb130?, 0x86c9785b0?}, 0xdf6820?)
        /home/runner/go/pkg/mod/nhooyr.io/[email protected]/write.go:28 +0x1d
nhooyr.io/websocket/wsjson.write({0xdfb130?, 0x86c9785b0?}, 0x12a05f200?, {0xb599a0, 0x86cdeb340})
        /home/runner/go/pkg/mod/nhooyr.io/[email protected]/wsjson/wsjson.go:54 +0x99
nhooyr.io/websocket/wsjson.Write(...)
        /home/runner/go/pkg/mod/nhooyr.io/[email protected]/wsjson/wsjson.go:48
github.com/ChristopherHX/github-act-runner/protocol/logger.(*WebsocketLivelogger).SendLog(0x86ced8f00, 0x86cdeb340)
        /home/runner/work/github-act-runner/github-act-runner/protocol/logger/job_logger.go:95 +0x6f
github.com/ChristopherHX/github-act-runner/protocol/logger.(*WebsocketLiveloggerWithFallback).SendLog(0x86cf15400, 0x86cdeb340)
        /home/runner/work/github-act-runner/github-act-runner/protocol/logger/job_logger.go:148 +0x4c
github.com/ChristopherHX/github-act-runner/protocol/logger.(*BufferedLiveLogger).sendLogs(0x86cc94f80, 0x86c74ae70, 0x86ca9ea80?)
        /home/runner/work/github-act-runner/github-act-runner/protocol/logger/job_logger.go:225 +0x7f
created by github.com/ChristopherHX/github-act-runner/protocol/logger.(*BufferedLiveLogger).SendLog in goroutine 146730
        /home/runner/work/github-act-runner/github-act-runner/protocol/logger/job_logger.go:247 +0x107

hasufell Dec 03 '25 09:12

I created a patch in https://github.com/ChristopherHX/github-act-runner/pull/226, but this slightly reveals how many error handling problems exist in my runner application. (The cause here is a websocket disconnect: reconnecting failed, so the connection handle is now null, which the code does not expect.)
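The failure mode described here (a nil connection handle after a failed reconnect) can be sketched in Go as follows. This is not the content of the linked pull request and does not use the real websocket library; `wsConn` and `liveLogger` are hypothetical stand-ins that only show a nil check before writing, with unsent lines kept in a fallback buffer.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// wsConn is a hypothetical stand-in for a websocket connection handle.
type wsConn struct {
	sent []string
}

func (c *wsConn) write(line string) error {
	c.sent = append(c.sent, line)
	return nil
}

var errNotConnected = errors.New("live log websocket is not connected")

// liveLogger is a hypothetical logger whose conn may be nil after a
// failed reconnect attempt.
type liveLogger struct {
	mu       sync.Mutex
	conn     *wsConn
	fallback []string // lines kept for another delivery path
}

// setConn swaps the connection handle; pass nil when reconnecting failed.
func (l *liveLogger) setConn(c *wsConn) {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.conn = c
}

// sendLog checks the handle under the lock before using it, so a nil
// handle yields an error and a buffered line instead of a panic.
func (l *liveLogger) sendLog(line string) error {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.conn == nil {
		l.fallback = append(l.fallback, line)
		return errNotConnected
	}
	return l.conn.write(line)
}

func main() {
	l := &liveLogger{}
	// No connection yet: the line is buffered and an error is returned.
	if err := l.sendLog("step 1 output"); err != nil {
		fmt.Println("buffered:", err)
	}
	// After a successful (re)connect the line goes to the websocket.
	l.setConn(&wsConn{})
	if err := l.sendLog("step 2 output"); err == nil {
		fmt.Println("sent over websocket")
	}
}
```

The mutex matters because, in the trace above, log sending runs in its own goroutine while reconnecting happens elsewhere, so the handle can change between the check and the write unless both are guarded.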

ChristopherHX Dec 08 '25 19:12