pomerium
Pomerium Version 0.19.1 appears to not hot reload configs correctly
What happened?
Hi there! We run Pomerium in Nomad and use consul-template to render the Pomerium config. With Pomerium 0.16 this worked perfectly, but we recently upgraded to 0.19 and hot reloading of the config no longer works.
Basically, it seems that the config is never reloaded.
In the logs I can see log messages similar to
{"level":"debug","watch_file":"/local/config.yaml","time":"2022-09-09T08:15:24Z","message":"filemgr: watching file for changes"}
and
{"level":"info","watch_file":"/local/config.yaml","event":"notify.Remove","time":"2022-09-09T08:16:09Z","message":"filemgr: detected file change"}
{"level":"info","config_file_source":"/local/config.yaml","config_change_id":"787248e9-a72b-4ff4-8b00-609e9a8094d6","time":"2022-09-09T08:16:09Z","message":"config: file updated, reconfiguring..."}
{"level":"info","service":"envoy","name":"upstream","time":"2022-09-09T08:16:09Z","message":"cds: add 1 cluster(s), remove 0 cluster(s)"}
and so on. However, the new domains are not added, and the metric pomerium_config_last_reload_success_timestamp is not updated with the latest timestamp.
Removing entries does appear to work: when the template is re-rendered without an object, Pomerium returns a 404 for it. Adding entries does not work.
What did you expect to happen?
Pomerium will watch a file and hot reload a config
How'd it happen?
- Run pomerium with a specific configuration which proxies domain x
- While pomerium is running, mv a new file without domain x into the path of the old config (note it doesn't use the same inode)
- Pomerium should 404 (which is correct)
- While pomerium is running, mv a new file with domain x re-added into the path of the config (note it doesn't use the same inode)
- Pomerium will still 404
What's your environment like?
- Pomerium version (retrieve with pomerium --version): v0.19.1
- Server Operating System/Architecture/Cloud: Ubuntu, running in Nomad
What's your config.yaml?
---
address: :8080
authenticate_service_url: https://pomerium.example.com
cookie_secret: SECRET
databroker_storage_connection_string: postgres://SECRET
databroker_storage_type: postgres
forward_auth_url: https://pomerium.example.com
idp_client_id: SECRET
idp_client_secret: SECRET
idp_provider: azure
idp_provider_url: SECRET
idp_refresh_directory_interval: 10m
idp_refresh_directory_timeout: 5m
idp_service_account: SECRET
insecure_server: true
jwt_claims_headers: email
metrics_address: 0.0.0.0:9090
shared_secret: SECRET
signing_key: SECRET
signing_key_algorithm: RS256
policy:
  - from: https://fake.example.com
    to: https://fake.example.com
    allowed_groups:
      - fakegroup1
      - fakegroup2
    allow_websockets: true
What did you see in the logs?
See above
Hello,
@desimone I see this issue too. I'm using a custom pomerium-operator with Pomerium 0.18.0, so it's definitely an issue with hot reloading in k8s.
What I've found:
It comes down to how k8s updates mounted files:
drwxr-xr-x 2 root root 60 Sep 14 16:18 ..2022_09_14_16_18_51.209307213
lrwxrwxrwx 1 root root 31 Sep 14 16:18 ..data -> ..2022_09_14_16_18_51.209307213
lrwxrwxrwx 1 root root 18 Sep 14 16:14 config.yaml -> ..data/config.yaml
where:
# constant symlink into the symlinked folder; it never changes
config.yaml -> ..data/config.yaml
# temporary symlink to the latest version of the mounted object; it is repointed after the secret/config is updated
..data -> ..2022_09_14_16_18_51.209307213
# actual directory holding the data (latest version)
..2022_09_14_16_18_51.209307213
So when an update happens, ..2022_09_14_16_18_51.209307213 is replaced with a new folder and the ..data symlink is repointed to the new location.
Why hot reloading isn't working
Source of the problem: Pomerium Core uses the notify library (which appears to be abandoned) to monitor FS events.
Because of how Watch is implemented in this library (notify/notify.go at master · rjeczalik/notify), the symlink is resolved to its real path on the FS (notify/util.go at master · rjeczalik/notify), so the watch is tied to that resolved path.
But that path no longer exists after the update, so no surprise here, actually. @desimone you can simply emulate this k8s behavior with these 2 scripts:
# init.sh
#!/bin/bash
mkdir '..data-initial'
cat <<EOF > '..data-initial/config.yaml'
<post in the config whatever you want>
EOF
ln -s '..data-initial' '..data'
ln -s '..data/config.yaml' 'config.yaml'
# update.sh
#!/bin/bash
rm -rf '..data-initial'
mkdir '..data-upd'
cat <<EOF > '..data-upd/config.yaml'
initial: 0
upd: true
EOF
ln -fns '..data-upd' '..data'
rm -rf '..data-upd'
mkdir '..data-upd-2'
cat <<EOF > '..data-upd-2/config.yaml'
initial: 0
upd: true
upd2: true
EOF
ln -fns '..data-upd-2' '..data'
Simple code snippet to test against the existing implementation:
// https://github.com/pomerium/pomerium/blob/89a105c8e6478196cb9dd40371749c339d274f10/internal/fileutil/watcher.go
// I changed watcher to see events:
func (watcher *Watcher) Add(filePath string, sub string) {
	watcher.mu.Lock()
	defer watcher.mu.Unlock()
	// already watching
	if _, ok := watcher.filePaths[filePath]; ok {
		return
	}
	//ctx := context.TODO()
	ch := make(chan notify.EventInfo, 1)
	go func() {
		for e := range ch {
			log.Println("-----------------------------------------------------------------------------------------")
			log.Printf("[notify]%s|changed:%s,event %s \n", sub, filePath, e.Event().String())
			log.Println("-----------------------------------------------------------------------------------------")
			//watcher.Signal.Broadcast(ctx)
		}
	}()
	err := notify.Watch(filePath, ch, notify.All)
	if err != nil {
		log.Println("[old]: error watching file path")
		notify.Stop(ch)
		close(ch)
		return
	}
	log.Println("[old]watching file for changes")
	watcher.filePaths[filePath] = ch
}
// main.go
func K8SFSEvent(f string) {
	watcher := pm.NewWatcher()
	watcher.Add(f, "file")
	watcher.Add(filepath.Dir(f), "dir")
}
func main() {
	f := os.Getenv("WPATH")
	if len(f) == 0 {
		f = "./config.yaml"
	}
	K8SFSEvent(f)
	for {
		log.Println(time.Now())
		time.Sleep(10 * time.Second)
	}
}
We can update the code to use the fsnotify package. I don't remember why we used the notify package.
I've implemented it like this: https://github.com/lokkersp/pomerium/commit/51511457ea1d2bf05efa143c99e3b994402ec669