kubernetes.core icon indicating copy to clipboard operation
kubernetes.core copied to clipboard

k8s module: --diff-mode support only half works with multi-resource manifests

Open stmarier opened this issue 5 years ago • 5 comments

SUMMARY

We have a kustomize lookup plugin that renders a kustomize directory structure into a resource_definition for the k8s module. We are observing that when the k8s module manages multiple resources, --diff does not display inline diffs.

The same process returns inline diffs when the resource_definition is only a single resource.

After spending a bunch of time with this I have a fix committed to my employer's ansible fork, but I don't know where would be best for this bug report to live upstream.

The problem appears to be that inside raw.py we are setting diffs as follows:

match, diffs = self.diff_objects(existing, result['result'])
[...]
result['diff'] = diffs

This is ends up returning a result structure that looks something like this:

{
    "changed": true,
    "invocation": {},
    "result": {
        "results": [
            { "changed": false, "diff": {}, <etc> },
            { 
                "changed": true,
                "diff": {
                    "after": <resource spec post-change>,
                    "before": <existing resource spec>,
                    <etc>
                }
            },
            <other result items>
        ]
    }
}

Because our result structure is so complicated, it is causing problems in upstream task handlers, I found 2 and resolving them locally enabled us to start seeing friendly diff output regardless of resource count.

  1. default strategy plugin https://github.com/ansible/ansible/blob/devel/lib/ansible/plugins/strategy/init.py#L504-L514 The functionality here makes some potentially naive assumptions about the result dictionary:
            elif '_ansible_item_result' in task_result._result:
                if task_result.is_failed() or task_result.is_unreachable():
                    self._tqm.send_callback('v2_runner_item_on_failed', task_result)
                elif task_result.is_skipped():
                    self._tqm.send_callback('v2_runner_item_on_skipped', task_result)
                else:
                    if 'diff' in task_result._result:
                        if self._diff or getattr(original_task, 'diff', False):
                            self._tqm.send_callback('v2_on_file_diff', task_result)
                    self._tqm.send_callback('v2_runner_item_on_ok', task_result)
                continue

None of these conditions end up evaluating as True for our returned result structure:

  • our task_result._result has a results: [] list instead of a diff:, since we are pushing the diff key into each result item under results.
  • we have many possible results which each must be examined for a diff

Without an adjustment to either our return value or the upstream logic, our multi-resource tasks never fire the v2_on_file_diff callback, so our diff is only visible while running with -vvv which dumps the entire data structure to the display.

For some reason this works fine with single resource module invocations, which leads me to believe that ansible.module_utils.common.dict_transformations.recursive_diff might be handling that case nominally but not the multi-resource case,

  1. default callback plugin https://github.com/ansible/ansible/blob/devel/lib/ansible/plugins/callback/default.py#L274-L288
    def v2_on_file_diff(self, result):
        if result._task.loop and 'results' in result._result:
            for res in result._result['results']:
                if 'diff' in res and res['diff'] and res.get('changed', False):
                    diff = self._get_diff(res['diff'])
                    if diff:
                        if self._last_task_banner != result._task._uuid:
                            self._print_task_banner(result._task)
                        self._display.display(diff)
        elif 'diff' in result._result and result._result['diff'] and result._result.get('changed', False):
            diff = self._get_diff(result._result['diff'])
            if diff:
                if self._last_task_banner != result._task._uuid:
                    self._print_task_banner(result._task)
                self._display.display(diff)

Similar to the strategy plugin issue, even if we end up firing this callback, there are two issues:

  1. Even though we are returning a list of results, the module invocation is not in "loop mode", so the first conditional statement evaluates to False. The second conditional looks promising but it fails for the same reason as the strategy plugin: our diff key is in each result._result['results'] item, not the top-level result as configured here.

  2. The entire callback execution is wrapped in a try: statement that does a search through all possible callbacks, so our KeyErrors here are silently suppressed.

ISSUE TYPE
  • Bug Report / Feature Question
COMPONENT NAME

k8s.py, possibly any other modules that use diff_objects() inside module_utils/common.py.

ANSIBLE VERSION

tested on at least ansible 2.9.0, but the source issue is present in master and hasn't been touched for multiple years (generally)

EXPECTED RESULTS

Multi-resource k8s invocations, when run with diff mode, will output a pretty diff instead of the vvv required dense diff.

stmarier avatar Apr 22 '20 18:04 stmarier

The resolution I have committed locally is extremely brute-force and just extends all of these upstream conditionals to support the various ways result will appear. I'm not very familiar with this part of the codebase, but I have to wonder if there isn't a more graceful way we can handle passing result back to the task handlers that would work with the existing assumptions they make.

stmarier avatar Apr 22 '20 18:04 stmarier

I just recommend not doing that

Having a simple task loop over resource manifests means that you get a named task per resource with a proper diff (that works with ansible-playbook -Dvvv)

I've implemented this pattern (which we use in production) in the kube-resource role: https://github.com/willthames/ansible-role-kube-resource/blob/master/tasks/main.yml#L66-L68

willthames avatar Jun 17 '20 09:06 willthames

@stmarier Does @willthames's suggestion work for you here?

tima avatar Aug 11 '20 18:08 tima

No, it doesn't work for me here. The workaround I implemented in my employer's ansible fork has been working fine since April and we didn't have to fundamentally alter the way we were writing tasks.

This remains a bug, IMHO. I see that you've added a needs_info tag to this bug report. What do you want added to this issue?

stmarier avatar Aug 11 '20 20:08 stmarier

@stmarier is it possible for you to share the kustomize lookup plugin?

We're thinking of doing some manual work to get around this diff issue, but the kustomize lookup would be super useful

jam01 avatar Nov 22 '20 18:11 jam01

I think I am seeing this bug. I am applying a multiresource definition. However, I am not seeing diff out put when --diff and task is changed.

jobcespedes avatar Oct 22 '22 03:10 jobcespedes

I spent some time looking at this and while we could address this in the collection, there are limitations to what we can do. Given those limitations, I'm not convinced this would be a useful change. I have some changes locally that bring the diffs into a top level diff property which means ansible will print the diff as usual. The problem is the diff output does not have enough information to meaningfully connect the different diffs to the underlying resources. And any resources that did not change would not be in the diffs, making it even more difficult to connect changes to resources since there is no guarantee of a one to one mapping. For anything more than a handful of changes, the diff will be of little use. See for example this output:

TASK [kubernetes.core.k8s] ******************************************************************************************************************
--- before
+++ after
@@ -1,6 +1,8 @@
 {
     "metadata": {
-        "labels": {},
+        "labels": {
+            "one": "two"
+        },
         "managedFields": [
             {
                 "apiVersion": "v1",
@@ -9,13 +11,14 @@
                     "f:metadata": {
                         "f:labels": {
                             ".": {},
-                            "f:kubernetes.io/metadata.name": {}
+                            "f:kubernetes.io/metadata.name": {},
+                            "f:one": {}
                         }
                     }
                 },
                 "manager": "OpenAPI-Generator",
                 "operation": "Update",
-                "time": "2022-10-24T18:39:02Z"
+                "time": "2022-10-24T18:46:15Z"
             }
         ]
     }

--- before
+++ after
@@ -1,6 +1,8 @@
 {
     "metadata": {
-        "labels": {},
+        "labels": {
+            "three": "four"
+        },
         "managedFields": [
             {
                 "apiVersion": "v1",
@@ -9,13 +11,14 @@
                     "f:metadata": {
                         "f:labels": {
                             ".": {},
-                            "f:kubernetes.io/metadata.name": {}
+                            "f:kubernetes.io/metadata.name": {},
+                            "f:three": {}
                         }
                     }
                 },
                 "manager": "OpenAPI-Generator",
                 "operation": "Update",
-                "time": "2022-10-24T18:39:02Z"
+                "time": "2022-10-24T18:46:15Z"
             }
         ]
     }

changed: [localhost]

Compare this with the output of kubectl diff where the diff is matched to the kind/name of the resource:

diff -u -N /tmp/LIVE-022929964/v1.Namespace..foobar /tmp/MERGED-806872987/v1.Namespace..foobar
--- /tmp/LIVE-022929964/v1.Namespace..foobar    2022-10-24 13:52:02.170867129 -0400
+++ /tmp/MERGED-806872987/v1.Namespace..foobar  2022-10-24 13:52:02.170867129 -0400
<snip diff>
diff -u -N /tmp/LIVE-022929964/v1.Namespace..foobaz /tmp/MERGED-806872987/v1.Namespace..foobaz
--- /tmp/LIVE-022929964/v1.Namespace..foobaz    2022-10-24 13:52:02.182866996 -0400
+++ /tmp/MERGED-806872987/v1.Namespace..foobaz  2022-10-24 13:52:02.186866953 -0400
<snip diff>

kubectl can do this because it's just a blob of text where in ansible the return value is structured data. If having ansible print the diff for multiple resources is important, my suggestions would be to either do what @willthames suggested or create your own customized stdout callback plugin and make it look however you want.

gravesm avatar Oct 24 '22 19:10 gravesm

We were refactoring from individual resources to multi-resource one task definition, to improve performance and throughput(parallel reconciles) as well to reduce cpu, memory utilization. @willthames suggestion, while useful, does not help for that.

jobcespedes avatar Oct 24 '22 19:10 jobcespedes

Given the difficulties in implementing this in a way that will be useful, I'm going to close it. We'll update the documentation to highlight the limitations of using multiple resource definitions with diff in #536

gravesm avatar Oct 27 '22 14:10 gravesm