openlibrary icon indicating copy to clipboard operation
openlibrary copied to clipboard

Reversing author merges also reverts editions and works

Open seabelis opened this issue 2 years ago • 2 comments

Occasionally, I notice that authors were merged incorrectly. I can sometimes reverse the merge, but have noticed recently that a consequence of this is that edits made to related works and editions are reverted as well and without warning. It's not expected that reversing an author merge would affect non-author edition or work data.

Evidence / Screenshot (if possible)

Relevant url?

For example, these are all different people and this merge should not have happened. It is desirable to reverse the merge so that the proper works are restored to the proper authors/editors. However, the danger in this is that any edits made to any of the associated works will be reversed as well, even though those edits are not related to author data. https://openlibrary.org/recentchanges/2010/11/26/merge-authors/42049802?page=1

Here you can see that the title, description, cover, and subjects were all lost after reversing an author merge. https://openlibrary.org/works/OL14863262W/Tales_of_Soldiers_and_Civilians_18_stories?_compare=Compare&b=26&a=24&m=diff

Steps to Reproduce

  1. Go to ...an author that was merged incorrectly
  2. Do ...reverse the merge
  • Actual: Works and editions are also reverted.
  • Expected: Only authors should be reverted.

Details

  • Logged in (Y/N)?
  • Browser type/version?
  • Operating system?
  • Environment (prod/dev/local)? prod

Proposal & Constraints

Related files

Stakeholders

@mekarpeles @cdrini @jimchamp

seabelis avatar Apr 04 '22 09:04 seabelis

Another merge that should be reversed. https://openlibrary.org/recentchanges/2011/04/18/merge-authors/43308762

seabelis avatar Apr 04 '22 17:04 seabelis

What's going on here is the author merge doesn't actually record the specific changes made to works during the merge, it just makes the change and notes the new revision number of the work. All that the merge history records is that these authors and these works were changed, and these are the current revision numbers for each of those objects.

When you request an undo, the process looks at that history, takes each object (authors and works) and reverts them to the revision immediately before the one that's recorded.

However, if, after the author merge, someone comes along and makes other edits to any of those records, the undo process will still revert them all to the revisions that existed immediately before the merge, wiping away any subsequent edits. That's true for both the works and the authors. The same issue will exist with regard to work merge undo when that is deployed.

The only way I can see around this would be to make the undo process significantly smarter, so that it could compare the revision numbers in the merge history with the current revision numbers of each object, and if they are different, capture the changes applied since the merge revision and reapply them after reverting to the pre-merged versions.

I think that's doable, but it's a pretty major change to how undo works, so I'd want some input from @cdrini before doing any work in that direction.

mheiman avatar Sep 22 '22 18:09 mheiman

@mheiman I agree I think we need to make that process smarter. It should:

  1. Find the revision number it wants to rollback to. Say that's reversion 7
  2. Find the diff in the JSON records between revision 7 and 8.
  3. Attempt to apply the inverse of that diff to the latest version of the document
  4. If it succeeds, save the record. If it fails, display an error.

There are libraries to do this sort of JSON algebra; not sure which is best:

  • https://github.com/xlwings/jsondiff
  • https://github.com/seperman/deepdiff

cdrini avatar Nov 20 '23 15:11 cdrini