ldif-diff creates entries in wrong order (add vs. modify)
ldif-diff is a great tool for previewing semi-automated changes before sending them to the LDAP server. I ran into one problem, though. When the diff contains multiple types of changes, e.g. a group should be created (add) and that group should become a member of other groups (modify), the modification that adds the new group as a member of the existing group is written before the add that creates the group. Applying the changes in that order results in an error.
Here's an example output (manually modified to prevent internal data leakage):
$ java -jar lib/unboundid-ldapsdk-5.0.2-2020-05-30.jar ldif-diff --sourceLDIF reports/TSTX-ad-export.ldif --targetLDIF reports/TSTX-all.ldif
# Entries read from source LDIF file 'exported.ldif': 20
# Entries read from target LDIF file 'new.ldif': 21
dn: CN=some-existing-group,OU=Groups,DC=example,DC=com
changetype: modify
add: member
member: CN=some-new-group,OU=Groups,DC=example,DC=com
-
dn: CN=some-new-group,OU=Groups,DC=example,DC=com
changetype: add
objectClass: group
objectClass: top
groupType: -2147483640
instanceType: 4
cn: some-new-group
sAMAccountName: some-new-group
description: some-new-group
# LDIF diff processing completed successfully.
# Add count: 1
# Delete count: 0
# Modify count: 1
It would be great to generate the add operations before modifications to enable fully automated usage.
Regards, Andreas
That's a good point, and thanks for reporting it. The best order would probably be to report adds first, then modifies, and finally deletes. That would allow modifies to reference entries that have just been added, and also to allow them to remove references to entries that are about to be deleted.
I've just committed a change that causes the tool to write adds first, then modifies, and then deletes.
Note that this does not completely solve the problem in all circumstances, because it's possible that the set of adds could include both a new user and a new group that references the new user. If the group is added before the user, then you'll have the same problem. The tool uses the DN class as a comparator to determine the order in which it writes add operations. It sorts by hierarchy first, so that parents will be ordered before children, but entries below the same parent will be ordered lexicographically by RDN.
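To make the ordering concrete, here is a minimal Python sketch of the idea, not the SDK's actual code. It assumes a change is just a `(changetype, dn)` pair, splits DNs naively on commas (no escaping), and compares RDNs from the root down as a rough stand-in for the DN comparator. The names `CHANGE_RANK`, `sort_key`, and `order_changes` are made up for illustration.

```python
# Adds first, then modifies, then deletes, as described above.
CHANGE_RANK = {"add": 0, "modify": 1, "delete": 2}

def rdns_from_root(dn):
    # Naive split: does not handle escaped commas or multi-valued RDNs.
    parts = [part.strip().lower() for part in dn.split(",")]
    return parts[::-1]  # root-most RDN first

def sort_key(change):
    changetype, dn = change
    # Lists compare element by element, and a parent's RDN list is a prefix
    # of its children's lists, so parents sort before children and siblings
    # sort lexicographically by RDN.
    # Assumption: the same DN key is used for every change type; how deletes
    # should be ordered among themselves is not covered here.
    return (CHANGE_RANK[changetype], rdns_from_root(dn))

def order_changes(changes):
    return sorted(changes, key=sort_key)

changes = [
    ("modify", "CN=some-existing-group,OU=Groups,DC=example,DC=com"),
    ("add", "CN=some-new-group,OU=Groups,DC=example,DC=com"),
]
ordered = order_changes(changes)
```

With this key, a new group `CN=a-group` still sorts before a new user `CN=b-user` under the same parent, which is exactly the remaining case described above.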
But I don't think that this is a problem that ldif-diff needs to solve on its own, and it may not even be possible to solve in all cases. There could be a circular reference (e.g., if you add one entry, then add a second entry that references the first, and then modify the first entry to reference the second) that prevents the changes from working in any order.
Wow, that was fast - fixed an hour after reporting it :-). Thanks for fixing it - I can confirm that it's now working for my use cases as expected.
I also had the problem that new groups may reference members that have not been created yet. I solved it by splitting the LDIF changes into two parts:
- Create all new entries (without adding members)
- Modify entries (either just created or already existing) by adding the members afterwards
Maybe ldif-diff can take the same approach to be more robust in scenarios like that.
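The two-pass split described above could look roughly like the following Python sketch. It is only an illustration under some assumptions: an entry is a DN plus a dict of attribute name to list of string values, values never need base64 encoding or line folding, and `member` is the only reference attribute. The function name `split_add` is hypothetical.

```python
def split_add(dn, attributes, ref_attr="member"):
    """Split one new entry into an add without references and a later modify.

    Returns (add_ldif, modify_ldif); modify_ldif is None when the entry
    has no values for ref_attr.
    """
    wanted = ref_attr.lower()
    refs = []
    add_lines = [f"dn: {dn}", "changetype: add"]
    for name, values in attributes.items():
        if name.lower() == wanted:
            refs.extend(values)  # hold references back for the second pass
        else:
            add_lines.extend(f"{name}: {value}" for value in values)
    if not refs:
        return "\n".join(add_lines), None
    mod_lines = [f"dn: {dn}", "changetype: modify", f"add: {ref_attr}"]
    mod_lines.extend(f"{ref_attr}: {value}" for value in refs)
    mod_lines.append("-")
    return "\n".join(add_lines), "\n".join(mod_lines)

add_ldif, mod_ldif = split_add(
    "CN=some-new-group,OU=Groups,DC=example,DC=com",
    {
        "objectClass": ["group", "top"],
        "cn": ["some-new-group"],
        "member": ["CN=some-new-user,OU=Users,DC=example,DC=com"],
    },
)
```

All the add records would be written first and all the modify records afterwards, so every referenced entry exists by the time the references are added.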
I’m not going to do that, because it would add complexity for a niche problem, especially given that many directory servers don’t enforce that kind of referential integrity, or at least not in their default configuration. Some could also see it as undesirable behavior: it increases the number of operations required to update one data set to look like another, and it makes it harder to look at the output and see what the differences are, because they may be spread out in a couple of places.
Further, it would also likely trade one problem for another. I’m not familiar with Active Directory and a lot of its specifics, but some types of groups don’t allow you to create them without members. In particular, the groupOfNames object class (defined in RFC 4519 section 3.5) requires that the member attribute be present, and the groupOfUniqueNames object class (same specification, next section) requires that the uniqueMember attribute be present. Some servers have relaxed definitions for those groups that make it possible to create those groups without members, but a strictly compliant server would reject attempts to create groups without members.
As I mentioned before, the best approach I can think of to addressing the problem would be to maintain a rejects file when applying changes, and then re-try any rejected operations to see if they are accepted on a second attempt.
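A rough Python sketch of that retry idea follows. It is not tied to any particular tool: `apply_change` is a hypothetical callback that sends one change to the server and raises the (also hypothetical) `ChangeRejected` exception when the server refuses it, and the in-memory `pending` list stands in for the rejects file.

```python
class ChangeRejected(Exception):
    """Raised by apply_change when the server rejects a change (assumed)."""

def apply_with_retries(changes, apply_change, max_passes=5):
    """Apply changes, re-trying rejected ones; return those still rejected."""
    pending = list(changes)
    for _ in range(max_passes):
        if not pending:
            break
        rejected = []
        for change in pending:
            try:
                apply_change(change)
            except ChangeRejected:
                rejected.append(change)  # keep it for the next pass
        if len(rejected) == len(pending):
            # Nothing was accepted in this pass, so another pass cannot help
            # (for example, because of a circular reference).
            break
        pending = rejected
    return pending
```

Whatever is returned at the end still needs manual attention; the loop stops early when a pass makes no progress, so it cannot spin forever on changes that will never be accepted.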
I'm fine with that since my original use case is already working :-).