Graceful doesn't return after calling http.Server.Shutdown()
Preflight checklist
- [X] I could not find a solution in the existing issues, docs, nor discussions.
- [X] I agree to follow this project's Code of Conduct.
- [X] I have read and am following this repository's Contribution Guidelines.
- [ ] This issue affects my Ory Cloud project.
- [ ] I have joined the Ory Community Slack.
- [ ] I am signed up to the Ory Security Patch Newsletter.
Describe the bug
The Graceful function blocks in here if you manually shutdown the server.
This was a surprise to me, since the http.Server returned by graceful.WithDefaults has a regular Shutdown method that you can call, yet if you do it, the graceful.Graceful function still hangs waiting for a signal.
Reproducing the bug
This program should exit in 4 seconds, yet it hangs until you send one of the signals used by Graceful.
package main
import (
"context"
"log"
"net/http"
"time"
"github.com/ory/graceful"
)
func main() {
server := graceful.WithDefaults(&http.Server{
Addr: ":8080",
})
// wait 4 seconds and shutdown the server
go func() {
time.Sleep(4 * time.Second)
log.Println("shutting down server")
if err := server.Shutdown(context.Background()); err != nil {
log.Printf("failed to shutdown server: %v", err)
}
}()
log.Println("starting server")
if err := graceful.Graceful(server.ListenAndServe, server.Shutdown); err != nil {
log.Fatalf("failed to gracefully shut down: %v", err)
}
log.Println("server closed")
}
Relevant log output
; go run main.go
2022/06/09 08:42:44 starting server
2022/06/09 08:42:48 shutting down server
^C2022/06/09 08:42:54 server closed
Note how the last line was not printed just after the second line, I had to press Control-C to finish the program instead.
Relevant configuration
No response
Version
v0.1.2
On which operating system are you observing this issue?
Linux
In which environment are you deploying?
Other
Additional Context
When you call Shutdown on the server started with graceful.Graceful the start call here returns with an http.ErrServerClosed, then the execution blocks on line 80 because the goroutine started in line 59 is blocked on line 62 still waiting for a signal to arrive, even though the server is not longer running.
Ideally, when starting a server via graceful, we should still be able to shut it down using the standard method (http.Server.Shutdown) and the Graceful function should return without leaking the goroutine started here.
the go.mod I used to reproduce the bug:
; cat go.mod
module local
go 1.18
require github.com/ory/graceful v0.1.2
require github.com/pkg/errors v0.9.1 // indirect
Hey, that's a great point! Any ideas you have to implement that are highly appreciated!
Here are some things that might help:
-
the goroutine created here waits forever here until a signal arrives. But if we shut down the server manually we want that goroutine to die so we don't leak it.
In order to do so, instead of blocking on
<-stopChanwe canselectonstopChanand some newdonechannel that gets closed in line 76 whenstartreturns. It might be good to callsignal.Stoptoo so the lib has a chance to free the signal handler resources allocated during thesignal.Notify. -
the writes to the
errChanat lines 68 and 72 assume there is a goroutine that is going to read at line 80, but that won't be the case ifstartreturns an error other thanErrServerClosed, therefore the goroutine started at 59 will block forever again on any of those writes.You can fix that by making
errChana buffered channel of size 1 or re-designing the code to take into account the concurrency of signals and people callingShutdownon the server. -
regarding line 76, the code checks for
http.ErrServerClosedas a way to know ifstartfailed because of a signal coming in or for any other reasons. But that assumption fails when you callShutdownon the server manually.Probably the best way to fix that is by adding an explicit way to communicate that a signal has arrived, instead of trying to reuse the
http.ErrServerClosedthat can also be received in other circumstances.
A more general approach might be to rethink the design of the Graceful function taking into account concurrency, it might lead to better results than trying to address each problem stated above separately, or maybe not and you will reach a very similar solution.
Thank you for these ideas! I think another option would be to use
ctx, cancel := context.WithCancel(ctx)
server.RegisterOnShutdown(cancel)
and then listen to ctx.Done() in the subroutine that's blocking.
I think server.RegisterOnShutdown is a nice solution! it even appends to a slice of functions so it will work just fine even if users have already registered their own shutdowns for other purposes before calling Graceful.