deepdiff
Unexpected Repetition of Elements when Generating Delta Between Dictionaries
Describe the bug
When using the deepdiff library to find the differences between two dictionaries and generate a delta, an unexpected repetition of elements occurs.
To Reproduce
from deepdiff import DeepDiff, Delta
d1 = {'a': [{'id': 1}, {'id': 2}, {'id': 3}]}
d2 = {'a': [{'id': 1}, {'id': 2}, {'id': 3}, {'id': 4}]}
deep_diff_result = DeepDiff(d1, d2, exclude_regex_paths=[r"(?=root.*\['id'\])"], ignore_order=True, report_repetition=True)
result = d2 + Delta(deep_diff_result)
print(result)
- Take two dictionaries:
d1 = {'a': [{'id': 1}, {'id': 2}, {'id': 3}]}
d2 = {'a': [{'id': 1}, {'id': 2}, {'id': 3}, {'id': 4}]}
- Use the following code to compare them:
from deepdiff import DeepDiff, Delta
deep_diff_result = DeepDiff(d1, d2, exclude_regex_paths=[r"(?=root.*\['id'\])"], ignore_order=True, report_repetition=True)
- Check the output:
{'repetition_change': {"root['a'][0]": {'old_repeat': 3, 'new_repeat': 4, 'old_indexes': [0, 1, 2], 'new_indexes': [0, 1, 2, 3], 'value': {'id': 1}}}}
- Apply the delta to d2:
result = d2 + Delta(deep_diff_result)
print(result)
Expected behavior
I expected the only difference between d1 and d2 to be the {'id': 4} entry.
OS, DeepDiff version and Python version (please complete the following information):
- OS: macOS
- Version Ventura 13.4.1
- Python Version 3.9.0
- DeepDiff Version 6.3.0
Additional context
The result produced was {'a': [{'id': 1}, {'id': 1}, {'id': 1}, {'id': 1}, {'id': 2}, {'id': 3}, {'id': 4}]}, which contains four repetitions of {'id': 1}.
I've not found any similar issue on Stack Overflow, and I've reviewed open and closed issues on the deepdiff GitHub repository without identifying any similar scenarios.
Related Research: I've looked through Delta Documentation, but it didn't provide clarity for this particular case.
Thanks in advance
Hi @kfirc
Thanks for reporting this.
What is happening here is that exclude_regex_paths is not working properly with report_repetition:
In [1]: from deepdiff import DeepDiff, Delta
...:
...: d1 = {'a': [{'id': 1}, {'id': 2}, {'id': 3}]}
...: d2 = {'a': [{'id': 1}, {'id': 2}, {'id': 3}, {'id': 4}]}
...:
...: deep_diff_result = DeepDiff(d1, d2, exclude_regex_paths=[r"(?=root.*\['id'\])"], ignore_order=True, report_repetition=True)
...:
In [2]: deep_diff_result
Out[2]:
{'repetition_change': {"root['a'][0]": {'old_repeat': 3,
'new_repeat': 4,
'old_indexes': [0, 1, 2],
'new_indexes': [0, 1, 2, 3],
'value': {'id': 1}}}}
In [3]: deep_diff_result = DeepDiff(d1, d2, ignore_order=True, report_repetition=True)
In [4]: deep_diff_result
Out[4]: {'iterable_item_added': {"root['a'][3]": {'id': 4}}}
What the Delta object receives in your case is an instruction that {'id': 1} needs to be repeated 4 times. That is why you get the unexpected result.
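One plausible reading of why the diff reports a repetition at all: the regex in the report matches the path of every 'id' key, so once those keys are excluded the list items may no longer be distinguishable. The sketch below uses only the standard library and does not call DeepDiff. The helpers `visible_part` and `count_visible` are hypothetical, and matching paths with `search` is an assumption about how the exclusion is applied, not a description of DeepDiff's internals.

```python
import re
from collections import Counter

# The regex from the report. The lookahead succeeds for any path containing ['id'].
pattern = re.compile(r"(?=root.*\['id'\])")

d1 = {'a': [{'id': 1}, {'id': 2}, {'id': 3}]}
d2 = {'a': [{'id': 1}, {'id': 2}, {'id': 3}, {'id': 4}]}


def visible_part(item, item_path):
    """Hypothetical helper: drop every key whose path matches the exclusion regex."""
    return {
        key: value
        for key, value in item.items()
        if not pattern.search(f"{item_path}[{key!r}]")
    }


def count_visible(items):
    """Count list items by what is left after exclusion (an assumed model only)."""
    return Counter(
        tuple(sorted(visible_part(item, f"root['a'][{index}]").items()))
        for index, item in enumerate(items)
    )


old_counts = count_visible(d1['a'])
new_counts = count_visible(d2['a'])
print(old_counts)  # Counter({(): 3})
print(new_counts)  # Counter({(): 4})
```

Under this model every item collapses to the same empty value, seen 3 times in d1 and 4 times in d2, which lines up with the old_repeat of 3 and new_repeat of 4 in the reported output.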