deepdiff `iterable_compare_func` does not seem to work on nested lists


Describe the bug: I cannot get the compare function to work properly on lists that are not at the first level.

To Reproduce: Define the following two objects. They are identical except that:

the version attribute went from 0.0.0 to 0.0.1
the second object has one number field removed from the numberfield_set list and a new one added. UUID4s are used to track which items should be correlated with each other, much like the ids in the example in the docs.

self_json= b'{"stringfield_set":[],"numberfield_set":[{"uuid":"fa0c87e8-01f5-43f5-8e63-24886f72ffd0","name":"field_1","address":1,"documentation":"","first_bit_offset":2,"size_in_bits":3,"is_signed":false,"is_lsb_left":false,"value_multiply_by":1.0,"value_divide_by":1.0,"value_increase_by":0.0,"value_unit":""},{"uuid":"332e5cfe-886d-41cc-b4d9-0fc1296b3ea0","name":"field_2","address":5,"documentation":"","first_bit_offset":5,"size_in_bits":6,"is_signed":false,"is_lsb_left":false,"value_multiply_by":1.0,"value_divide_by":1.0,"value_increase_by":0.0,"value_unit":""}],"enumfield_set":[],"version":"0.0.0","version_date":null,"version_commit":null,"ros_link":null,"documentation":"","definition":42,"information":[],"users":[]}'
other_json=b'{"stringfield_set":[],"numberfield_set":[{"uuid":"fa0c87e8-01f5-43f5-8e63-24886f72ffd0","name":"field_1","address":1,"documentation":"","first_bit_offset":2,"size_in_bits":3,"is_signed":false,"is_lsb_left":false,"value_multiply_by":1.0,"value_divide_by":1.0,"value_increase_by":0.0,"value_unit":""},{"uuid":"056429c5-812b-4f49-9aae-1b52bd40aacd","name":"field_3","address":7,"documentation":"","first_bit_offset":8,"size_in_bits":9,"is_signed":false,"is_lsb_left":false,"value_multiply_by":1.0,"value_divide_by":1.0,"value_increase_by":0.0,"value_unit":""}],"enumfield_set":[],"version":"0.0.1","version_date":null,"version_commit":null,"ros_link":null,"documentation":"","definition":42,"information":[],"users":[]}'

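Make the iterable compare function:

from deepdiff.helper import CannotCompare


def field_diff_function(x, y, level=None):
    # Two items are considered the same field when their uuid values match.
    try:
        return x["uuid"] == y["uuid"]
    except Exception:
        # Tell DeepDiff these items cannot be compared by uuid.
        raise CannotCompare() from None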

Run the compare:

import json

from deepdiff import DeepDiff

diff_native = DeepDiff(
    json.loads(self_json),
    json.loads(other_json),
    iterable_compare_func=field_diff_function,
    ignore_order=True,
)

the following diff is the result:

{
  "values_changed": {
    "root['numberfield_set'][1]['uuid']": {
      "new_value": "056429c5-812b-4f49-9aae-1b52bd40aacd",
      "old_value": "332e5cfe-886d-41cc-b4d9-0fc1296b3ea0"
    },
    "root['numberfield_set'][1]['name']": {
      "new_value": "field_3",
      "old_value": "field_2"
    },
    "root['numberfield_set'][1]['address']": {
      "new_value": 7,
      "old_value": 5
    },
    "root['numberfield_set'][1]['first_bit_offset']": {
      "new_value": 8,
      "old_value": 5
    },
    "root['numberfield_set'][1]['size_in_bits']": {
      "new_value": 9,
      "old_value": 6
    },
    "root['version']": {
      "new_value": "0.0.1",
      "old_value": "0.0.0"
    }
  }
}

Expected behavior: Because the fields should be matched on the uuid attribute, the diff should show that one field was added and another was removed, not that their values changed.
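The expected pairing can be sketched without DeepDiff at all. The helper below, `match_by_key`, is a hypothetical name used only for illustration and is not part of DeepDiff's API. It pairs list items on `uuid` and reports unpaired items as removed or added. The two lists are trimmed to the `uuid` and `name` keys of the objects above.

```python
def match_by_key(old_items, new_items, key="uuid"):
    """Pair dicts from two lists on `key`; unpaired items are removed/added."""
    old_by_key = {item[key]: item for item in old_items}
    new_by_key = {item[key]: item for item in new_items}
    # Items present on both sides are candidates for a field-by-field diff.
    matched = [(old_by_key[k], new_by_key[k]) for k in old_by_key if k in new_by_key]
    # Items present on only one side should never be reported as "changed".
    removed = [item for k, item in old_by_key.items() if k not in new_by_key]
    added = [item for k, item in new_by_key.items() if k not in old_by_key]
    return matched, removed, added


old_fields = [
    {"uuid": "fa0c87e8-01f5-43f5-8e63-24886f72ffd0", "name": "field_1"},
    {"uuid": "332e5cfe-886d-41cc-b4d9-0fc1296b3ea0", "name": "field_2"},
]
new_fields = [
    {"uuid": "fa0c87e8-01f5-43f5-8e63-24886f72ffd0", "name": "field_1"},
    {"uuid": "056429c5-812b-4f49-9aae-1b52bd40aacd", "name": "field_3"},
]

matched, removed, added = match_by_key(old_fields, new_fields)
print([item["name"] for item in removed])  # ['field_2']
print([item["name"] for item in added])    # ['field_3']
```

Under this pairing, field_2 and field_3 are never compared with each other, so no values_changed entries would be produced for index 1 of numberfield_set.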

OS, DeepDiff version and Python version (please complete the following information):

Python Version: 3.9.10
deepdiff version: 5.8.0

Additional context I'm definitely not ruling out there's a problem somewhere between the chair and the keyboard

Apr 29 '22 13:04 jvacek

Upon further inspection, slapping a print() statement inside the compare function shows that the function is actually never run.
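A slightly more robust version of the print() check is to count invocations. This is only a sketch: `count_calls` and `same_uuid` are hypothetical names, and `same_uuid` is a simplified stand-in for the compare function from the report (it omits the CannotCompare handling).

```python
import functools


def count_calls(func):
    """Wrap `func` so the number of invocations is recorded on the wrapper."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return func(*args, **kwargs)
    wrapper.calls = 0
    return wrapper


@count_calls
def same_uuid(x, y, level=None):
    # Stand-in for the compare function: items match on equal uuid.
    return x["uuid"] == y["uuid"]


# Called directly here only to show the counter working.
same_uuid({"uuid": "a"}, {"uuid": "a"})
same_uuid({"uuid": "a"}, {"uuid": "b"})
print(same_uuid.calls)  # 2
```

Passing the wrapped function as iterable_compare_func and reading its `calls` attribute after the diff would show 0 if the function is never invoked, which is what the print() test suggests.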

Apr 29 '22 13:04 jvacek

The iterable_compare_func function is not called when you set ignore_order parameter to True from what I know

Jun 25 '22 13:06 LizardBlizzard

> The iterable_compare_func function is not called when you set ignore_order parameter to True from what I know

Is this a bug or intended behaviour? It is not mentioned in the docs, at least.

I'm doing diffs on nested dicts containing lists of dicts that have id keys, where iterable_compare_func helps compare the correct dicts to each other. However, the performance is not very good. Setting ignore_order=True helps performance a lot, but the wrong dicts are sometimes compared, since iterable_compare_func is not used.

Jan 17 '23 15:01 havardthom

I'm having the same issue and was about to create an issue... I also did some more testing and debugging...

I think there are actually 2 bugs in this:

1. This is exclusive to ignore_order=True. There's a len > 1 check that prevents the function from being called: only if there are two or more different items in both lists does the function actually get called. That is, if there is 1 addition and 1 removal, it doesn't call the function to check whether they are a pair or not. I'm almost certain this is incorrect behaviour. I edited it locally and got this fixed (you can mimic it by introducing another differing element in both lists). Instead of getting key 'id' changed from a to b, I'm now getting dict {'id': a} changed to dict {'id': b}, i.e. instead of reporting key/value changes, the entire dicts are reported as changed. https://github.com/seperman/deepdiff/blob/8ab1c8dbf19bb87177c10029a518051d6622532a/deepdiff/diff.py#L1094
2. Irrespective of ignore_order: the result of the custom compare function is ignored for nested dicts. I've tested this on a list of dicts in which each dict has a key whose value is also a list of dicts. Each dict has an id key. The example below sets ignore_order=True, but the problem appears in both cases; just make sure the order is correct.

[
{'id': '1010', 'g': [ {'id': '2020'} ] },
{'id': '73', 'g': [ {'id': '101'}, {'id': '6790'} ] }
]

[
{'id': '73', 'g': [ {'id': '202'},  {'id': '15294'} ] },
{'id': '1012', 'g': [ {'id': '2020'} ] }
]

I use this as the compare function. Except for the prints, it is taken straight from the docs.

from deepdiff.helper import CannotCompare


def compare_func(x, y, level=None):
    print(x)
    print(y)
    print('in')
    if not isinstance(x, dict) or not isinstance(y, dict):
        print('not dict')
        raise CannotCompare
    if x['id'] == y['id']:
        print('match')
        return True
    print('not match')
    return False

I found out that it gets called 8 times: 4 times comparing the 2 top-level dicts to each other, and 4 times comparing the 2 nested dicts to each other (the ones in the dict with the same id). The prints were as expected for both the top-level and the nested dicts, but the results are different. 'id': '1010' and 'id': '1012' are not a pair and appeared in iterable_item_added and iterable_item_removed accordingly. But the nested dicts were paired with each other and entered in values_changed DESPITE the function returning False for all pairing checks between them...
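For reference, the result that id-based pairing should give on the two lists above can be sketched in plain Python. `expected_diff` is a hypothetical helper, not DeepDiff code, and the path strings it builds are ad hoc labels rather than DeepDiff's path syntax.

```python
def expected_diff(old_items, new_items, path="root", key="id", child="g"):
    """Pair dicts on `key`, then recurse into the `child` list of each pair."""
    report = {"removed": [], "added": []}
    new_by_key = {item[key]: item for item in new_items}
    old_keys = {item[key] for item in old_items}
    for item in old_items:
        partner = new_by_key.get(item[key])
        if partner is None:
            # No item with the same id on the other side: a removal.
            report["removed"].append(f"{path}[{key}={item[key]}]")
            continue
        # Same id on both sides: compare their nested lists the same way.
        nested = expected_diff(item.get(child, []), partner.get(child, []),
                               f"{path}[{key}={item[key]}]['{child}']", key, child)
        report["removed"] += nested["removed"]
        report["added"] += nested["added"]
    for item in new_items:
        if item[key] not in old_keys:
            report["added"].append(f"{path}[{key}={item[key]}]")
    return report


t1 = [
    {'id': '1010', 'g': [{'id': '2020'}]},
    {'id': '73', 'g': [{'id': '101'}, {'id': '6790'}]},
]
t2 = [
    {'id': '73', 'g': [{'id': '202'}, {'id': '15294'}]},
    {'id': '1012', 'g': [{'id': '2020'}]},
]

report = expected_diff(t1, t2)
print(report["removed"])
print(report["added"])
```

With this pairing, the nested items 101 and 6790 are removals and 202 and 15294 are additions; none of them are paired, so nothing would land in values_changed.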

Mar 03 '23 17:03 Omar-Abdul-Azeez

I did some more debugging and found out that the dicts are reported as iterable_item_removed and not as values_changed, which led me to think the problem is in this line: https://github.com/seperman/deepdiff/blob/8ab1c8dbf19bb87177c10029a518051d6622532a/deepdiff/diff.py#L309 Up to this point all differences found are under removed or added, but once this function gets called they change to changed for some reason... Taking a quick look inside that function, there's a suspicious mutual_add_removes_to_become_value_changes() call that makes me believe it doesn't respect the custom compare function... I took a quick peek inside it and it seems to do exactly that. I think I need a break... https://github.com/seperman/deepdiff/blob/8ab1c8dbf19bb87177c10029a518051d6622532a/deepdiff/diff.py#L1551

Mar 03 '23 18:03 Omar-Abdul-Azeez

Hi @jvacek @Omar-Abdul-Azeez @havardthom @LizardBlizzard Somehow this ticket was lost among other tickets I was paying attention to. @Omar-Abdul-Azeez Thank you for diving into it already. I am going to take a look soon.

Nov 19 '23 16:11 seperman

I think I'm running into the same issue. I'm using a custom iterable_compare_func function to match on id. All the results BUT ONE are correctly reported either as iterable_item_added or iterable_item_removed. The single incorrect one is reported as values_changed with clearly a non matching id.

@seperman and @Omar-Abdul-Azeez have you been able to fix this? Thanks a lot.

Jan 09 '24 19:01 pascalstindt
