efm_config_post hook
Proposed fix
In our setup we run into issues with efm. In short: it is a sudo EXEC issue caused by our stricter security policies. More below...
Therefore I would like to suggest an efm_post_config hook which would run just after all config steps and can be used to fix issues like these. This may well be a very specific issue on our end, but others may have other issues that could use an efm_post_config hook as well.
I have created a PR (#249) for a resolution.
EFM sudo EXEC Issue
In short, we have set Defaults noexec in our sudo config, which means that every sudo rule needs the EXEC tag if its command executes other commands.
This is a security setting which we set on all systems, cannot leave out for Postgres systems, and will not leave out entirely.
TPA uses the Ansible sudoers module, which has no option to set EXEC (only a noexec option).
We have created a sudoers file, which does work properly.
But whenever we run TPA, it creates a sudoers.d file that breaks our sudoers config. TPA then tries to start or restart the service, which no longer works because of the broken sudo config.
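As an illustration of the kind of rule described above, a hand-maintained sudoers entry with the EXEC tag could be installed by a task along these lines. This is a minimal sketch, not TPA's own code: the file name, the EFM version directory, and the efm_root_functions path are assumptions and need to match the local install.

```yaml
# Sketch only: file name and command path are assumptions, not taken from TPA.
- name: Install an EFM sudoers rule that carries the EXEC tag
  ansible.builtin.copy:
    dest: /etc/sudoers.d/efm-exec   # hypothetical file name
    owner: root
    group: root
    mode: "0440"
    # Refuse to install the file if visudo reports a syntax error.
    validate: /usr/sbin/visudo -cf %s
    content: |
      # EXEC: overrides the global "Defaults noexec" for this rule only.
      # The path below is an assumed example; adjust it to the installed EFM version.
      efm ALL=(ALL) NOPASSWD: EXEC: /usr/edb/efm-4.9/bin/efm_root_functions
  become: true
```

The validate step matters here: a sudoers.d file with a syntax error can break sudo on the whole node, so the file is only put in place once visudo accepts it.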
Hello, thank you for submitting this issue. We will create an internal ticket to work on this and will update this issue once it is resolved.
Thanks @JonathanRenon-EDB.
I would like to elaborate a bit on our needs:
This proposes a generically usable fix, but for us it is a hard requirement. In its current state, TPA is almost entirely unusable in our environment. Currently we deploy clusters by running TPA 3 times, with a hotfix playbook in between to resolve this issue. When we run TPA again, it breaks the sudo config again. We currently undo that with the post-deploy hook, but that is too late for restarts when the efm config changes. This means that efm config changes require the same procedure as at deploy time (running TPA 3 times with hotfix playbooks in between).
The solution I would consider is to make it possible to skip this particular sudoers task. You would then ensure that nodes have the correct rule in place, and TPA would simply avoid changes that break the sudoers configuration. This should remove the need for a hook and bring the TPA run more in line with idempotency: the efm role would no longer make changes that you later have to overwrite in a hook, which adds at least 2 changed tasks per run that contradict each other on every subsequent run.
This would use the excluded_task mechanism explained here, by adding a new tag that would skip the task discussed above.
Adding the hook might still be part of the change (maybe not specific to efm), since hooks are easily added and can make sense at this point of the deployment.
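A sketch of how skipping the sudoers task might look from the user's side, assuming the setting is spelled excluded_tasks and placed under cluster_vars in config.yml. The tag name is hypothetical, since the tag proposed above does not exist yet.

```yaml
# config.yml (excerpt). Sketch under assumptions, not a confirmed TPA setting for EFM.
cluster_vars:
  excluded_tasks:
    - efm-sudoers   # hypothetical tag name; the real one would be defined by the change
```

With something like this in place, the site-specific sudoers rule stays under the user's control and a repeated TPA run would no longer rewrite it.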
I agree with your comment and proposal. Allowing TPA to break the config only to fix it in a later step is a bit clumsy. Want me to extend the PR? Or do you want to create one yourself?
You can go ahead and expand the PR if you have the free cycles and the will to do so. We would still import the PR into our internal repo for review and integration testing before adding it to the next release (currently planned for mid-August).
@sebasmannem Let us know if you are willing to add this to your PR or if we should prioritize it on our side. If we take it on ourselves, it will probably not make it into the next release but into a later one.