integreat-cms icon indicating copy to clipboard operation
integreat-cms copied to clipboard

Delete all but last 3 auto-saves

Open mael-fosso opened this issue 1 year ago • 15 comments

Short description

Delete all but the last 3 autos-saves and renumber all affected versions to be continuous.

Proposed changes

  • we implemented a static method cleanup_auto_save(). This method clears all auto-save except the last three (per default).
  • Then we used the bulk_update() method to update the version of the pages.

Side effects

  • when we renumbering auto-save bulk_update() we previously got IntegrityErrors , probably because postgresql enforced unique constraints before the update statement was finished. We implemented the workaround using individual Data base statement in a For-loop, but revisiting this now we cannot reproduces this. We reverted to bulk_update() but please test this thoroughly to sure this doesn't break.

Resolved issues

Fixes: #2067


Pull Request Review Guidelines

mael-fosso avatar Apr 09 '23 18:04 mael-fosso

Code Climate has analyzed commit 558df669 and detected 0 issues on this pull request.

The test coverage on the diff in this pull request is 96.5% (50% is the threshold).

This pull request will bring the total coverage in the repository to 82.3% (0.0% change).

View more on Code Climate.

codeclimate[bot] avatar Apr 09 '23 18:04 codeclimate[bot]

Maybe a changelog entry? 🙂

MizukiTemma avatar Apr 13 '23 15:04 MizukiTemma

Right, that one thing that never comes to mind after getting everything to work...

@mael-fosso You can find all the information you need here in the docs, if you want to try it on your own. It's pretty straight forward!

PeterNerlich avatar Apr 13 '23 18:04 PeterNerlich

A question for the issue itsself: max. 3 auto save versions in all over version history or between 2 nearest manual saves?

Example: publish - auto save - publish - auto save - publish - auto save - auto save - publish

Should all 4 auto saves remain there in this case because there is no more than 3 auto saves in each manual save interval?

The current code keeps up to 3 auto save versions in all over version history (the first auto save (version 2 in the example) will be deleted). Though I don' think the current behaviour harms anything because all manual saves stay there, if max 3 in each interval is explicitly wished we have to change how to calculate the earlist and affected_versions.

MizukiTemma avatar Apr 14 '23 11:04 MizukiTemma

The current code keeps up to 3 auto save versions in all over version history

Yes, that is how we understood the issue description. In my opinion, this also makes most sense, since the motivation is to cut down on auto saves nobody cares about anymore that clutter the UI as well as DB space. I imagine auto saves to only be relevant when they are very recent, and mostly useless when sandwiched between old Draft or Published versions. I don't expect users to want to view or restore them in the latter case.

PeterNerlich avatar Apr 14 '23 14:04 PeterNerlich

Hey @mael-fosso and @PeterNerlich, thanks for tackling this issue! 🙂

Should all 4 auto saves remain there in this case because there is no more than 3 auto saves in each manual save interval?

I think one could argue, that an autosave which contains some relevant changes for a page, will be published or explicitly saved as draft at some point, therefore I guess it makes sense to leave only 3 auto-saves overall.

I know, that this was part of the issue requirements, but from a UX perspective I'm not sure about the change of version numbers as I can imagine, that users will refer to them and for a user it might not be very obvious now, when and why version numbers will be squashed. Not sure though, how much of an issue this is in reality.

Also I realized, that auto saves will not be reduced to a total of 3 when using the "Restore and Publish" button in the page revision view. I would have expected this behaviour here as well.

membralala avatar Apr 17 '23 11:04 membralala

I also got the IntegrityError when there are gaps of non-deleted versions between a bunch of auto saves (e.g. [draft, autosave, draft, autosave, autosave, autosave, autosave])

Okay, so we still do have this issue.

We could solve this by setting deferrable=Deferrable.DEFERRED, but I think this can only be done globally (not only for this method), and comes with performance penalties, so I would suggest to leave it in a loop and don't use bulk_update in this case.

Well, it has to be set on the constraint, in the model. It would not affect any other UNIQUE constraints, only the one of the page-language-version combination. I'm unable to judge a) how great the performance impact on transactions involving a deferred constraint actually is and b) where in the cms / how often overall such a transaction occurs.

Assuming this is only happening every time someone drafts or publishes some page content, with every version renumbering bulk_action() call involving 4 DB entries on average (3 auto saves and one Draft/Published – all old autosaves have already been deleted at that point, and I think this is not including the newly generated one from clicking Draft/Publish), and if deferring the constraint would make all transactions take twice as long, then I think it might be worth it...? Obviously, I have no idea what any of those averages actually are for us. Also, I can't believe making 4 individual DB calls instead would be much more performant.

PeterNerlich avatar Apr 20 '23 07:04 PeterNerlich

Well, it has to be set on the constraint, in the model. It would not affect any other UNIQUE constraints, only the one of the page-language-version combination. I'm unable to judge a) how great the performance impact on transactions involving a deferred constraint actually is and b) where in the cms / how often overall such a transaction occurs.

Assuming this is only happening every time someone drafts or publishes some page content, with every version renumbering bulk_action() call involving 4 DB entries on average (3 auto saves and one Draft/Published – all old autosaves have already been deleted at that point, and I think this is not including the newly generated one from clicking Draft/Publish), and if deferring the constraint would make all transactions take twice as long, then I think it might be worth it...? Obviously, I have no idea what any of those averages actually are for us. Also, I can't believe making 4 individual DB calls instead would be much more performant.

Yes I agree, for this specific use case the deferrable constraint would definitely be more performant than the loop, but I'm not completely sure about side effects on the other views. I'd just argue that for the user, the difference between 4ms and 1ms is not really noticeable, so I think it's not worth it evaluating the performance issues with deferrable constraints at this point in time. At least with the loop, we know that the performance is not ideal, but also not a big problem.

timobrembeck avatar Apr 20 '23 09:04 timobrembeck

@mael-fosso @PeterNerlich what's the state of this PR? Please re-request review if all requested changes have been addressed.

timobrembeck avatar Jul 17 '23 16:07 timobrembeck

The state is: Waiting for Mael to finish his exams for this semester and to find time for continuing this. Mizuki said today their exam period was still going for two weeks, so I hope we'll be able to resume work here in 2–3 weeks.

PeterNerlich avatar Jul 17 '23 21:07 PeterNerlich

@mael-fosso Do you want to continue working on this PR?

timobrembeck avatar Nov 08 '23 17:11 timobrembeck

Oh, you're right. I think something similar also happens if you press the button "Publish auto save" in the revision view. I assume that also if you click on that button the saves should be squashed. Is my assumption correct or am I missing something?

JoeyStk avatar Feb 14 '24 15:02 JoeyStk

@JoeyStk Yes, the same behaviour in the revision. For consistent behaviour it's better to adjust the bulk action and version restoration so there are max. auto save versions just like after normal manual save.

MizukiTemma avatar Feb 14 '24 19:02 MizukiTemma

@JoeyStk I tried cleanup_autosaves in the publishing bulk action and found the following problem:

  1. Create a brand new page and trigger more than 3 times auto save.
  2. See the translation status in the tree shows "up to date" (because the first auto save is not minor edit).
  3. Publish the page translation by bulk action
  4. See the auto save versions are squashed but the translation status in the tree shows "missing"

The cause is that the bulk action inherit "minor edit" status from the lastest auto save and the first auto save (not minor edit) is gone by squashing.

Possible solutions are:

  1. Leave it so (but it's confusing with the translation status in my opinion)
  2. No squashing for bulk action (not consistent with normal publishing)
  3. Change the behaviour of bulk action so it always publishes as non minor edit (Is change of existing behaviour acceptable?)

So far as I read in #2337 and #2341 it seems there was no specification that the status of minor edit must be inherited from the latest version, and the option 3 looks the easiest for me 🤔

I'm glad to hear your opinion :)

MizukiTemma avatar Feb 19 '24 10:02 MizukiTemma

The cause is that the bulk action inherit "minor edit" status from the lastest auto save and the first auto save (not minor edit) is gone by squashing.

Possible solutions are:

  1. Leave it so (but it's confusing with the translation status in my opinion)
  2. No squashing for bulk action (not consistent with normal publishing)
  3. Change the behaviour of bulk action so it always publishes as non minor edit (Is change of existing behaviour acceptable?)

I think there is a fourth option, and it's the one that feels the most natural to me, because it keeps what I interpret to be the intention of the current system intact:

  1. Ensure in cleanup_autosaves that if any of the deleted versions were not a minor edit, the oldest kept version is set to be not a minor edit either.

This is not perfect, actually. We need to tweak this a bit, so cleanup_autosaves produces the proper results from any valid database state. We need to limit the search for a version not set as minor edit to the last chain of consecutive auto saves to delete that is also immediately adjacent to the oldest auto save to keep:

… → auto (major) → Manual → auto → auto → auto → auto

should, when publishing a new version, turn into

… → Manual → auto → auto → auto → Public

and not

… → Manual → auto (major) → auto → auto → Public

PeterNerlich avatar Feb 23 '24 13:02 PeterNerlich

I mostly agree with @PeterNerlich on this. However, if I'm not mistaken, there is only one possibility where an auto save can be a major version: When it's the very first version. In all other cases, an auto save will always be a minor edit: https://github.com/digitalfabrik/integreat-cms/blob/204a66f9da45d906a7471baa8fb94ca580718563/integreat_cms/cms/views/pages/page_form_view.py#L380-L392

So I think we only need to do one thing: After squashing all auto saves, we have to check whether the very first version is still an auto save, and if so, we have so set minor_edit to False. Then, we should be good to go.

timobrembeck avatar Jun 02 '24 10:06 timobrembeck

@timobrembeck and @MizukiTemma this PR is finally ready for review again :) Please note that this PR only attempts to fix the first checkbox of the requirements :)

After squashing all auto saves, we have to check whether the very first version is still an auto save, and if so, we have so set minor_edit to False. Then, we should be good to go.

Please find the according changes in Ensure first version is not a minor edit after cleanup

JoeyStk avatar Jun 02 '24 14:06 JoeyStk