
Pomerium Version 0.19.1 appears to not hot reload configs correctly

Open rorylshanks opened this issue 3 years ago • 1 comments

What happened?

Hi there! We run Pomerium in Nomad and use consul-template to render the Pomerium config. With Pomerium version 16 this worked perfectly; however, we recently upgraded to Pomerium version 19, and hot reloading of configs no longer works.

Basically, it seems that the config is never reloaded.

In the logs I can see log messages similar to

{"level":"debug","watch_file":"/local/config.yaml","time":"2022-09-09T08:15:24Z","message":"filemgr: watching file for changes"}

and

{"level":"info","watch_file":"/local/config.yaml","event":"notify.Remove","time":"2022-09-09T08:16:09Z","message":"filemgr: detected file change"}
{"level":"info","config_file_source":"/local/config.yaml","config_change_id":"787248e9-a72b-4ff4-8b00-609e9a8094d6","time":"2022-09-09T08:16:09Z","message":"config: file updated, reconfiguring..."}
{"level":"info","service":"envoy","name":"upstream","time":"2022-09-09T08:16:09Z","message":"cds: add 1 cluster(s), remove 0 cluster(s)"}

etc. However, the new domains are not added, and the metric pomerium_config_last_reload_success_timestamp is not updated with the latest timestamp.

Removing config entries appears to work, though: when the template is re-rendered to remove an object, Pomerium returns a 404 for it. Adding entries does not work.
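One way to check whether the reload metric mentioned above actually advances is to scrape the metrics endpoint before and after re-rendering the config and compare the values. Below is a minimal Go sketch of the parsing step only. The helper name `metricValue`, the `service="pomerium"` label, and the sample value are assumptions for illustration, not taken from real Pomerium output.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// metricValue scans Prometheus text-format output and returns the largest
// value reported for the named metric across all of its label sets.
// (Hypothetical helper, not part of Pomerium.)
func metricValue(body, name string) (float64, bool) {
	var best float64
	found := false
	sc := bufio.NewScanner(strings.NewReader(body))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") || !strings.HasPrefix(line, name) {
			continue
		}
		rest := line[len(name):]
		// The name must be followed by a label set or a space; otherwise
		// this line belongs to a longer metric name.
		if rest == "" || (rest[0] != '{' && rest[0] != ' ') {
			continue
		}
		if rest[0] == '{' {
			end := strings.LastIndex(rest, "}")
			if end < 0 {
				continue
			}
			rest = rest[end+1:]
		}
		fields := strings.Fields(rest)
		if len(fields) == 0 {
			continue
		}
		v, err := strconv.ParseFloat(fields[0], 64)
		if err != nil {
			continue
		}
		if !found || v > best {
			best, found = v, true
		}
	}
	return best, found
}

func main() {
	// Illustrative sample only; real output would come from metrics_address.
	sample := "# TYPE pomerium_config_last_reload_success_timestamp gauge\n" +
		"pomerium_config_last_reload_success_timestamp{service=\"pomerium\"} 1.662711369e+09\n"
	if v, ok := metricValue(sample, "pomerium_config_last_reload_success_timestamp"); ok {
		fmt.Printf("last successful reload (unix seconds): %.0f\n", v)
	}
}
```

If the value is the same before and after a re-render, the reload did not register as successful, which matches what I see.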

What did you expect to happen?

Pomerium will watch a file and hot reload a config

How'd it happen?

  1. Run pomerium with a specific configuration which proxies domain x
  2. While pomerium is running, mv a new file without domain x into the path of the old config (note it doesn't use the same inode)
  3. Pomerium should 404 (which is correct)
  4. While pomerium is running, mv a new file with domain x re-added into the path of the config (note it doesn't use the same inode)
  5. Pomerium will still 404

What's your environment like?

  • Pomerium version (retrieve with pomerium --version): v0.19.1
  • Server Operating System/Architecture/Cloud: Ubuntu, running in Nomad

What's your config.yaml?


---
address: :8080
authenticate_service_url: https://pomerium.example.com
cookie_secret: SECRET
databroker_storage_connection_string: postgres://SECRET
databroker_storage_type: postgres
forward_auth_url: https://pomerium.example.com
idp_client_id: SECRET
idp_client_secret: SECRET
idp_provider: azure
idp_provider_url: SECRET
idp_refresh_directory_interval: 10m
idp_refresh_directory_timeout: 5m
idp_service_account: SECRET
insecure_server: true
jwt_claims_headers: email
metrics_address: 0.0.0.0:9090
shared_secret: SECRET
signing_key: SECRET
signing_key_algorithm: RS256

policy:
- from: https://fake.example.com
  to: https://fake.example.com
  allowed_groups:
  - fakegroup1
  - fakegroup2
  allow_websockets: true

What did you see in the logs?

See above

rorylshanks avatar Sep 09 '22 10:09 rorylshanks

Hello,

@desimone I see this issue too. I'm using a custom pomerium-operator with Pomerium 0.18.0, so it's definitely an issue with hot reloading in k8s.

What I've found:

It is caused by the way k8s updates mounted files:

drwxr-xr-x    2 root     root          60 Sep 14 16:18 ..2022_09_14_16_18_51.209307213
lrwxrwxrwx    1 root     root          31 Sep 14 16:18 ..data -> ..2022_09_14_16_18_51.209307213
lrwxrwxrwx    1 root     root          18 Sep 14 16:14 config.yaml -> ..data/config.yaml 

where:

# constant symlink to the file inside the symlinked folder; it never changes
config.yaml -> ..data/config.yaml 
# temporary symlink to the latest version of the mounted object; it is repointed after the secret/config is updated
..data -> ..2022_09_14_16_18_51.209307213 
# actual dir with the data (latest version)
..2022_09_14_16_18_51.209307213 

So when an update happens, ..2022_09_14_16_18_51.209307213 is replaced with a new folder and the ..data symlink is repointed to the new location.

Why hot reloading isn't working

Source of the problem: Pomerium Core uses the notify library (which is actually abandoned) to monitor FS events.

Because of how Watch is implemented in this library (notify/notify.go at master · rjeczalik/notify), the symlink is resolved to its real path on the FS (notify/util.go at master · rjeczalik/notify).

But this path no longer exists after the update, so no surprise here. @desimone, you can emulate this k8s behavior with these two scripts:

# init.sh

#!/bin/bash

mkdir '..data-initial'
cat <<EOF > '..data-initial/config.yaml'
<post in the config whatever you want>
EOF

ln -s '..data-initial' '..data'
ln -s '..data/config.yaml' 'config.yaml'

# update.sh

#!/bin/bash

rm -rf '..data-initial'
mkdir '..data-upd'
cat <<EOF > '..data-upd/config.yaml'
initial: 0
upd: true
EOF
ln -fns '..data-upd' '..data'

rm -rf '..data-upd'
mkdir '..data-upd-2'
cat <<EOF > '..data-upd-2/config.yaml'
initial: 0
upd: true
upd2: true
EOF
ln -fns '..data-upd-2' '..data'

Simple code snippet to test against existing implementation:

// https://github.com/pomerium/pomerium/blob/89a105c8e6478196cb9dd40371749c339d274f10/internal/fileutil/watcher.go
// I changed watcher to see events:
func (watcher *Watcher) Add(filePath string, sub string) {
	watcher.mu.Lock()
	defer watcher.mu.Unlock()

	// already watching
	if _, ok := watcher.filePaths[filePath]; ok {
		return
	}
	//ctx := context.TODO()
	ch := make(chan notify.EventInfo, 1)
	go func() {
		for e := range ch {
			log.Println("-----------------------------------------------------------------------------------------")
			log.Printf("[notify]%s|changed:%s,event %s \n", sub, filePath, e.Event().String())
			log.Println("-----------------------------------------------------------------------------------------")
			//watcher.Signal.Broadcast(ctx)
		}
	}()
	err := notify.Watch(filePath, ch, notify.All)
	if err != nil {
		log.Println("[old]: error watching file path")
		notify.Stop(ch)
		close(ch)
		return
	}
	log.Println("[old]watching file for changes")

	watcher.filePaths[filePath] = ch
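}

// main.go
package main

import (
	"log"
	"os"
	"path/filepath"
	"time"

	// Assumption: pm is a local copy of the modified watcher above;
	// this import path is a placeholder, not a real module.
	pm "example.com/watchtest/fileutil"
)
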
func K8SFSEvent(f string) {
	watcher := pm.NewWatcher()
	watcher.Add(f, "file")
	watcher.Add(filepath.Dir(f), "dir")
}

func main() {
	f := os.Getenv("WPATH")
	if len(f) == 0 {
		f = "./config.yaml"
	}
	K8SFSEvent(f)
	for {
		log.Println(time.Now())
		time.Sleep(10 * time.Second)
	}
}

x0ddf avatar Sep 14 '22 18:09 x0ddf

We can update the code to use the fsnotify package. I don't remember why we used the notify package.

calebdoxsey avatar Oct 13 '22 17:10 calebdoxsey

I've implemented it like this: https://github.com/lokkersp/pomerium/commit/51511457ea1d2bf05efa143c99e3b994402ec669

x0ddf avatar Oct 14 '22 06:10 x0ddf