gitea icon indicating copy to clipboard operation
gitea copied to clipboard

Preserve unix socket file

Open frankli0324 opened this issue 2 years ago • 21 comments

By default Gitea will always unlink any sockets that are provided using the LISTEN_FDS environment variable. This is because it uses this variable to handle passing when it is doing a graceful restart. However, this same mechanism is used by systemd - which explicitly expects that passed in sockets should not be unlinked by the receiving process.

This PR adjusts Gitea's graceful restart mechanism to use an additional environment variable which tracks if a listening socket was opened by Gitea - and therefore should be unlinked on shutdown by Gitea.

Fix #20490

frankli0324 avatar Jul 27 '22 04:07 frankli0324

I guess the question is if Gitea ever creates these sockets? If it does then we need to unlink them across graceful-restarts.

So is there a way of detecting that we should unlink this socket?

zeripath avatar Jul 28 '22 08:07 zeripath

Yup and outside of Systemd gitea will create the socket for itself:

	// make a fresh listener
	if err := util.Remove(address.Name); err != nil && !os.IsNotExist(err) {
		return nil, fmt.Errorf("Failed to remove unix socket %s: %v", address.Name, err)
	}

	l, err := net.ListenUnix(network, address)
	if err != nil {
		return nil, err
	}

	fileMode := os.FileMode(setting.UnixSocketPermission)
	if err = os.Chmod(address.Name, fileMode); err != nil {
		return nil, fmt.Errorf("Failed to set permission of unix socket to %s: %v", fileMode.String(), err)
	}

Meaning that it should be responsible for unlinking the socket on shutdown.


Which leads to the following questions:

  • Should Gitea always be doing this deletion and creation of sockets?
  • If it finds an extant socket file could it simply use that? - (I'm fairly certain I tried this and it didn't work hence the current code.)
  • How can we detect if when we're passed a socket over FDs whether we're supposed to unlink the sockets or not?

Now... we can leave the first two questions for the moment and consider the last question.

  1. Make (yet) another option that says don't set unlink. This is clearly not ideal.

OR

  1. If we look at how restart and this FD passing actually works you see that it's driven through the common environment variable LISTEN_FDS which we use in graceful-restart too. This indicates how we can tell gitea whether to unlink the fds on shutdown... Set another variable on graceful-restart. UNLINK_FDS which either is a bool to do this unlinking or says which fds should be unlinked.

We might need to store if we created the socket in the first place - this could actually be kinda simple - but I think we might be able to use the LISTEN_FDS and UNLINK_FDS we were ran with to do this actually quite efficiently.

However, I think we can't just apply this PR as it stands because it will leave the socket around which I think will cause problems with other systems.

zeripath avatar Jul 28 '22 09:07 zeripath

no, the function I edited takes the providedListeners, which could only be inherited. they have nothing to do with the sockets gitea created.

https://github.com/go-gitea/gitea/blob/c18d8d6968cb175703942631f0094ffb415f51f4/modules/graceful/net_unix.go#L56-L68

The logic you referenced happens outside the condition, which wouldn't be executed if gitea got a LISTEN_FD

https://github.com/go-gitea/gitea/blob/c18d8d6968cb175703942631f0094ffb415f51f4/modules/graceful/net_unix.go#L165-L179

the function returns (L172) at the point where gitea finds a valid socket.

hence your worries are invalid

frankli0324 avatar Jul 28 '22 09:07 frankli0324

3. If we look at how restart and this FD passing actually works you see that it's driven through the common environment variable LISTEN_FDS which we use in graceful-restart too. This indicates how we can tell gitea whether to unlink the fds on shutdown... Set another variable on graceful-restart. UNLINK_FDS which either is a bool to do this unlinking or says which fds should be unlinked.

this is also a bad, bad idea, at least to me, even if gitea can't tell the fds apart

frankli0324 avatar Jul 28 '22 09:07 frankli0324

Why is it a bad idea?

zeripath avatar Jul 28 '22 09:07 zeripath

How else do you communicate across restarts? It's completely consistent with the use of LISTEN_FDS. How else do you propose that we communicate this information?

zeripath avatar Jul 28 '22 09:07 zeripath

Why is it a bad idea?

I guess it's not important. systemd defines LISTEN_FDS solely for passing sockets to services. it's set by systemd, and should only be set by systemd, not other processes. if gitea takes its freedom to change that, it may:

  • confuse subprocesses
  • google "LISTEN_FDS", and gitea pops up, maybe five years later

frankli0324 avatar Jul 28 '22 09:07 frankli0324

How else do you communicate across restarts? It's completely consistent with the use of LISTEN_FDS. How else do you propose that we communicate this information?

gitea needn't care about that. if LFDS is set, gitea doesn't unlink that. if not, gitea unlinks. why would gitea talk across restarts about that?

frankli0324 avatar Jul 28 '22 09:07 frankli0324

Like many other things, Gitea uses the LISTEN_FDS environment variable to pass sockets across graceful restarts - it does this for ease of use and to keep the code common with systemd. I am simply proposing that Gitea when it does a graceful restart it creates another environment variable UNLINK_FDS which contains the FDS that should also be unlinked. Please look at the graceful restart code if you wish to understand how the LISTEN_FDS are used and/or created.

Please note, systemd was not the first thing to use LISTEN_FDS in this manner - the idea was being used elsewhere, FCGI IIRC was first - It therefore does not have a monopoly of use over the use environment variables to pass sockets around for graceful restart.

zeripath avatar Jul 28 '22 09:07 zeripath

scenarios:

gitea without LFDS

gitea creates socket file, and deletes it after shutdown. unixListener.SetUnlinkOnClose(true) never gets executed because providedListeners is empty.

gitea with LFDS

unixListener.SetUnlinkOnClose(true) gets executed, function returns, gitea never creates socket file

deleted

frankli0324 avatar Jul 28 '22 09:07 frankli0324

Read:

https://github.com/go-gitea/gitea/blob/3bd8f50af819b8dfc86b9fecdfc36ca7774c6a2d/modules/graceful/restart_unix.go#L34-L88

On a graceful restart Gitea sets LISTEN_FDS itself.

zeripath avatar Jul 28 '22 09:07 zeripath

........yes this doesn't fix it i know i need coffee

frankli0324 avatar Jul 28 '22 10:07 frankli0324

well sorry for being aggressive. I thought the code was intended for systemd support in the first place. I'll try making both systemd and graceful restart work.

frankli0324 avatar Jul 28 '22 10:07 frankli0324

as I say above the simplest solution is going to be to use something like UNLINK_FDS

I think this would do it:

Patch
diff --git a/modules/graceful/net_unix.go b/modules/graceful/net_unix.go
index 680ff529a..21c1b31ca 100644
--- a/modules/graceful/net_unix.go
+++ b/modules/graceful/net_unix.go
@@ -23,6 +23,7 @@ import (
 const (
 	listenFDs = "LISTEN_FDS"
 	startFD   = 3
+	unlinkFDs = "UNLINK_FDS"
 )
 
 // In order to keep the working directory the same as when we started we record
@@ -33,8 +34,10 @@ var (
 	once  = sync.Once{}
 	mutex = sync.Mutex{}
 
-	providedListeners = []net.Listener{}
-	activeListeners   = []net.Listener{}
+	providedListenersToUnlink = []bool{}
+	activeListenersToUnlink   = []bool{}
+	providedListeners         = []net.Listener{}
+	activeListeners           = []net.Listener{}
 )
 
 func getProvidedFDs() (savedErr error) {
@@ -53,6 +56,16 @@ func getProvidedFDs() (savedErr error) {
 			return
 		}
 
+		fdsToUnlinkStr := strings.Split(os.Getenv(unlinkFDs), ",")
+		providedListenersToUnlink = make([]bool, n)
+		for _, fdStr := range fdsToUnlinkStr {
+			i, err := strconv.Atoi(fdStr)
+			if err == nil || i < 0 || i >= n+startFD {
+				continue
+			}
+			providedListenersToUnlink[i-startFD] = true
+		}
+
 		for i := startFD; i < n+startFD; i++ {
 			file := os.NewFile(uintptr(i), fmt.Sprintf("listener_FD%d", i))
 
@@ -136,8 +149,11 @@ func GetListenerTCP(network string, address *net.TCPAddr) (*net.TCPListener, err
 	for i, l := range providedListeners {
 		if isSameAddr(l.Addr(), address) {
 			providedListeners = append(providedListeners[:i], providedListeners[i+1:]...)
+			needsUnlink := providedListenersToUnlink[i]
+			providedListenersToUnlink = append(providedListenersToUnlink[:i], providedListenersToUnlink[i+1:]...)
 
 			activeListeners = append(activeListeners, l)
+			activeListenersToUnlink = append(activeListenersToUnlink, needsUnlink)
 			return l.(*net.TCPListener), nil
 		}
 	}
@@ -148,6 +164,7 @@ func GetListenerTCP(network string, address *net.TCPAddr) (*net.TCPListener, err
 		return nil, err
 	}
 	activeListeners = append(activeListeners, l)
+	activeListenersToUnlink = append(activeListenersToUnlink, false)
 	return l, nil
 }
 
@@ -166,9 +183,15 @@ func GetListenerUnix(network string, address *net.UnixAddr) (*net.UnixListener,
 	for i, l := range providedListeners {
 		if isSameAddr(l.Addr(), address) {
 			providedListeners = append(providedListeners[:i], providedListeners[i+1:]...)
+			needsUnlink := providedListenersToUnlink[i]
+			providedListenersToUnlink = append(providedListenersToUnlink[:i], providedListenersToUnlink[i+1:]...)
+
+			activeListenersToUnlink = append(activeListenersToUnlink, needsUnlink)
 			activeListeners = append(activeListeners, l)
 			unixListener := l.(*net.UnixListener)
-			unixListener.SetUnlinkOnClose(true)
+			if needsUnlink {
+				unixListener.SetUnlinkOnClose(true)
+			}
 			return unixListener, nil
 		}
 	}
@@ -189,6 +212,7 @@ func GetListenerUnix(network string, address *net.UnixAddr) (*net.UnixListener,
 	}
 
 	activeListeners = append(activeListeners, l)
+	activeListenersToUnlink = append(activeListenersToUnlink, true)
 	return l, nil
 }
 
@@ -223,3 +247,11 @@ func getActiveListeners() []net.Listener {
 	copy(listeners, activeListeners)
 	return listeners
 }
+
+func getActiveListenersToUnlink() []bool {
+	mutex.Lock()
+	defer mutex.Unlock()
+	listenersToUnlink := make([]bool, len(activeListenersToUnlink))
+	copy(listenersToUnlink, activeListenersToUnlink)
+	return listenersToUnlink
+}
diff --git a/modules/graceful/restart_unix.go b/modules/graceful/restart_unix.go
index 2654ddfb9..1d0d1059e 100644
--- a/modules/graceful/restart_unix.go
+++ b/modules/graceful/restart_unix.go
@@ -12,6 +12,7 @@ import (
 	"net"
 	"os"
 	"os/exec"
+	"strconv"
 	"strings"
 	"sync"
 	"syscall"
@@ -75,6 +76,20 @@ func RestartProcess() (int, error) {
 	}
 	env = append(env, fmt.Sprintf("%s=%d", listenFDs, len(listeners)))
 
+	sb := &strings.Builder{}
+	for i, unlink := range getActiveListenersToUnlink() {
+		if !unlink {
+			continue
+		}
+		_, _ = sb.WriteString(strconv.Itoa(i))
+		_, _ = sb.WriteString(",")
+	}
+	unlinkStr := sb.String()
+	if len(unlinkStr) > 0 {
+		unlinkStr = unlinkStr[:len(unlinkStr)-1]
+		env = append(env, fmt.Sprintf("%s=%s", unlinkFDs, unlinkStr))
+	}
+
 	allFiles := append([]*os.File{os.Stdin, os.Stdout, os.Stderr}, files...)
 	process, err := os.StartProcess(argv0, os.Args, &os.ProcAttr{
 		Dir:   originalWD,

zeripath avatar Jul 28 '22 12:07 zeripath

will gitea unlink the socket file after a second graceful restart?

frankli0324 avatar Jul 28 '22 12:07 frankli0324

If I've written that patch correctly - yes, if Gitea originally opened the listener, otherwise no.

zeripath avatar Jul 28 '22 13:07 zeripath

it seems to me that the patched gitea fails to delete the socket after pkill -1 gitea and then pkill gitea

frankli0324 avatar Jul 28 '22 13:07 frankli0324

~~could we just... rename LISTEN_FDS to something like GITEA_LISTEN_FDS to avoid conflict?~~ I'm not feeling right about UNLINK_FDS, I think there's a better way

frankli0324 avatar Jul 28 '22 13:07 frankli0324

I believe that this construct is illegal?

a := make([]bool, 10)
for i, l := range a {
	fmt.Println(i, l)
	a = append(a[:i], a[i+1:]...)
}

frankli0324 avatar Jul 28 '22 15:07 frankli0324

if go provides a way for getting net.UnixListener.unlink bool attribute, this could be easier. I think this is ready for review...

frankli0324 avatar Jul 28 '22 15:07 frankli0324

I tested with and without systemd, and gitea works as intended. image

image

frankli0324 avatar Jul 28 '22 16:07 frankli0324

make lgtm work

zeripath avatar Aug 13 '22 21:08 zeripath