calcurse icon indicating copy to clipboard operation
calcurse copied to clipboard

[Feature request] Option to toggle an automatic merge on save

Open TripleTrable opened this issue 5 years ago • 31 comments

The periodic save function is kind of useless with calDAV sync because after a sync, you need to merge the new files with the buffered state. The periodic save function cancels the save if it detects that it is a periodic save. Please correct me if I am wrong.

TripleTrable avatar Jul 15 '20 10:07 TripleTrable

This should only happen if you modify the calendar both locally with calcurse and on the server, without syncing in between. The hooks could easily be changed to warn you (or launch an external merge tool) if this happens.

lfos avatar Jul 18 '20 12:07 lfos

But I can run a hook only before or after a save. If Calcurse detects a change the pre-save hook isn't run. And if I replace ret = IO_SAVE_CANCEL in io_save_cal() in the periodic save if with io_merge_data(), io_load_data() and ret = IO_SAVE_RELOAD the merge works, but the interface gets corrupted (no text in the sub windows and strange uft-8 chars).

TripleTrable avatar Jul 20 '20 06:07 TripleTrable

You should use the post-save hook. You can also trigger a calcurse reload from there if that helps?

lfos avatar Jul 20 '20 10:07 lfos

How do I trigger a reload/redraw from the hook? I couldn't find a hint in the man pages

TripleTrable avatar Jul 20 '20 12:07 TripleTrable

The Problem is that after I run the post-save hook in your example directory everything except the status-bar is gone. I need to manually trigger a redraw

TripleTrable avatar Jul 20 '20 13:07 TripleTrable

I'll close this issue because it looks like i have the same probelm as #309.

TripleTrable avatar Jul 21 '20 06:07 TripleTrable

The periodic save function is kind of useless with calDAV sync because after a sync, you need to merge the new files with the buffered state. The periodic save function cancels the save if it detects that it is a periodic save.

There are good reasons for that. At present only a manual (interactive) merge is possible. This cannot be handled from periodic save; however, it does warn you (by a system event) that the save operation was cancelled. You will then have to save interactively and are guided through the merge process (by default with vimdiff, not the most user friendly tool).

Please correct me if I am wrong.

I believe you are right, but an automatic merge is a major effort. Your contribution is appreciated.

lhca avatar Jul 21 '20 07:07 lhca

I'm trying to find some time to implement it.

TripleTrable avatar Jul 21 '20 08:07 TripleTrable

A few additions:

  • The merge tool can be configured with the MERGETOOL environment variable.
  • If we want to attempt an automated merge and only show a merge tool as a fallback, I'd strongly recommend trying to use an external tool for that too. It's unlikely that we can do much better than that here.

lfos avatar Jul 21 '20 10:07 lfos

How do I trigger a reload/redraw from the hook? I couldn't find a hint in the man pages

A reload of the disk file is triggered by sending a SIGUSR1 signal to calcurse. However, you end in the same interactive merge case.

If you're willing to experiment, try setting prepare_wins = 0 in hooks.c. That should eliminate the deadlock. A proper fix is coming up shortly.

lhca avatar Jul 21 '20 14:07 lhca

If I am right, I only need to implement a new hook (for no changes on save) and then merge after that, because if I only sync on post-save and pre-load the files aren't change between saves where it is buffered. The new thing is that after the no-save/nothing-save hook it needs to merge and recompute the hashes (no save because the buffer wasn't changed). I think my fix is too simple, if it's not a PR is incoming.

TripleTrable avatar Jul 22 '20 07:07 TripleTrable

Do you know how to send the SIGUSR1 from the save thread to the main thread ? to redraw the appointments. The appointments are random characters, probably because the reload happens in another thread.

TripleTrable avatar Jul 22 '20 08:07 TripleTrable

Threads in a program do not communicate by signals, and they don't need to. A signal handler is shared by all threads of a program, and if you look at the signal handler for SIGUSR1, you will see that it only sets a global variabel (want_reload) to 1. Global variables are shared by all threads. So the periodic save thread can just set want_reload to 1. The only thread that reads it, is the main thread.

But care is needed here. The main thread is the one that handles the (interactive) user interface. The purpose of the SIGUSR1 signal is to trigger a reload from another program (script), almost as if the user had pressed the reload key (^R by default). What will happen in your case I don't know because, frankly, I don't understand your explanation.

lhca avatar Jul 22 '20 21:07 lhca

Sorry if my question is hard to understand. The problem is that after a data-reload in the psave thread, I need to redraw the window to show the new data. If I redraw in the psave thread or use ungetch to trigger wgetch in the main thread, the name of the assignments become random ASCII chars (probably because of different threads). Do you have any Idea why that is? If that is solved, then maybe, there is no need for an auto merge feature.

TripleTrable avatar Jul 24 '20 06:07 TripleTrable

... The problem is that after a data-reload in the psave thread, I need to redraw the window to show the new data. ... If that is solved, then maybe, there is no need for an auto merge feature.

Before one solves a problem, one must understand it. Not to discourage you, but I fear you have misunderstood the way calcurse works and are trying to solve the wrong problem. Why do you think that an atomatic merge can be avoided?

An automatic merge is a replacement for the interactive merge. What, then, do you think the interactive merge accomplishes?

lhca avatar Jul 24 '20 10:07 lhca

Briefly about save and load in calcurse.

After interactive start of calcurse, the information (appointments, events and todos) is represented by two data sets: the disk files apts and todo and the in-memory data structures displayed on the terminal screen. The disk files are text files with one item per line. The in-memory data are various variables, arrays and linked lists.

When data are saved, the in-memory data are written to disk and replace the disk files. When data are loaded, the in-memory data are replaced by the contents of the disk files. But calcurse keeps track of changes to either of the two data sets and warns you if a load or save operation would cause data loss. If in-memory data have changed (for example because the user added an event) calcurse warns you and lets you choose what to do. Similarly, if the disk files have changed (for example because calcurse-caldav imported an appointment).

In both cases there are three possibilities:

  • go ahead and load from or save to disk as the case may be
  • merge the two data sets
  • cancel the operation

In you go ahead, you either loose in-memory changes (on load) or disk files changes (on save). If you choose to merge, the in-memory data set is written to a separate file called apts.new and an interactive mergetool (by default vimdiff) is started with apts and apts.new as arguments. You can edit the data in the apts file to your liking, save it and close the mergetool. Afterwards the edited apts file is loaded and replaces the in-memory data.

If periodic save detects a change to the disk files, it cancels the save and issues a system event. The problem we are discussing is how to let periodic save (or a user-initated save) perform a merge without user interinvention (an automatic merge).

lhca avatar Jul 25 '20 11:07 lhca

For non-interactive merges, we should probably also keep track of the diffbase (i.e. a snapshot of the data files at the time they were loaded in calcurse and before external modifications were made). We could then have a look at existing tools to perform the merge. For example, there's git-merge-file(1) which attempts to perform the merge automatically, marks any conflicts, and indicates with an exit status whether the merge was successful or an interactive merge is required. There might be other more generic tools out there but I did not yet do any extensive research.

lfos avatar Jul 26 '20 15:07 lfos

But first of all we should discuss and agree on the meaning of "merge". What should be the result of an automatic merge?

Because the two files are text files, my immediate thought would be: the union of the lines in the two files. All the traditional Unix tools are then available. Maybe the merge is just the output of

cat apts apts.new

or

sort -u apts apts.new

The latter would imply that duplicated items are eliminated. Is that acceptable? Or even desirable?

lhca avatar Jul 26 '20 17:07 lhca

Sorry for writing late. lfos, yes you're right. I have misunderstood the usage of the auto-merge ( I've thought of it not as a replacement but an extra feature. I haven't seen vimdiff at all. After initiating the merge, It all happens without interaction by me). Now I've understood the problem more or less.

Why do you think that an atomatic merge can be avoided?

I thought an auto merge wasn't necessary if calcurse handles the sync events. For example you could save the RAM stored data, then sync and after that reload. With that there wouldn't be a situation were the disk data is newer than the RAM stored data.

I don't understand what you mean by keeping tack of the diff base. Do you want to keep the changes over a long time (like an incremental backup) or do you want to use it just for this one specific merge.

@lhca :

In my opinion the removal of duplicates is desirable and cat or sort remove the problem of a failed merge but the question is how you would detect deleted appointments. A diff tool uses the date the file changed to see which is newer. Sort doesn't do it if i am correct. Without this we cannot delete an appointment because the 2 files are only concatenated (and sorted).

For example: I have a sync server which syncs my phone with my laptop. If I delete an appointment on my phone, it is synced to the server and then to the laptop. Now a merge needs to take place. (The appointment is still in memory and valid on my laptop.)

We take the apts and apts.new and use sort. bBoth files are concatenated and sorted (the duplicates are removed).

Because the appointment is only deleted in one file (apts), after concatenating the appointment is taken from the other file (apts.new). That is the problem. We cannot delete an appointment outside of

For better understanding an example file:

Pre-sync:

apts:
appointment 1
appointment 2

apts.new (memory):
appointment 1
appointment 2

Post-sync:

apts:
appointment 2

apts.new (memory):
appointment 1
appointment 2

after sort -u:

apts:
appointment 1
appointment 2

Maybe we should use the UNIX tools diff and patch.

TripleTrable avatar Jul 27 '20 07:07 TripleTrable

Anything based on sorting and/or duplicate detection is not a good idea. @TripleTrable pointed out one potential problem. Another one is an item that is edited both remotely and locally but in different ways (we don't want the item be duplicated here). This latter case should almost certainly not be handled by an automated merge, we should catch this case and require manual interaction (or show an error from periodic save). A merge that is always fully automatic is probably not desirable.

This is a somewhat complex problem and people have though about this before and created tools to solve this. Thus my suggestions to understand and leverage these tools instead of thinking about our own merge strategy.

The tools usually assume that we start with a common file A and create two different branches B, C of that file. We then want to merge B, C back into a common file A'. One basic observation is that this cannot be done reasonably with just B and C, the original A is needed too (which I called diff base). The example of editing an item in A in two different ways is a good example of why this is needed. With just B and C we cannot tell whether these two different items are originating from the same item in A or are two totally independent new items.

The fact that in calcurse items move to different lines when changing the date/time has to be taken into consideration and might complicate things a bit (but we might still be fine).

lfos avatar Jul 27 '20 20:07 lfos

@lfos, @TripleTrable: Thanks very much for contributing to my edification. This is new to me, hence my somewhat naive approach. I was more right than I knew when I wrote that "an automatic merge is a major effort".

lhca avatar Jul 28 '20 05:07 lhca

My idea would be to copy the apts file after each save to apts.old. With that we have a diff base and can merge the memory state from calcurse (apts.new) and any changes in apts (for example from caldav sync). In my implementation I would use Unix diff and patch. Do you have any ideas and or objections regarding this?

TripleTrable avatar Jul 28 '20 06:07 TripleTrable

Is it maybe desirable to only merge? I think of changing the io_save_cal to always do a merge, except if there are no file changes

TripleTrable avatar Jul 28 '20 08:07 TripleTrable

The question is what happens if the merge partially fails. Do we simply fall back to an interactive merge (and an error message for periodic save)? The merge tool I described above will perform a partial merge, mark the problematic areas in the file, and open an editor for the user to take care of these problematic areas.

I think I do not follow your latest comment "Is it maybe desirable to only merge?" fully; could you please elaborate?

lfos avatar Jul 28 '20 11:07 lfos

To maybe answer your question, I think this feature should be optional for the foreseeable future and configurable using an option, as stated in the title of this issue. The default behavior would be the current one (with improvements for handling periodic save).

lfos avatar Jul 28 '20 11:07 lfos

What I mean with only merge is that in the io_save_cal we only check if data has changed from the apts.old and if yes we merge and if no we cancel the save. Basically the "normal" save would be a merge too.

I think we should do a partial merge and send an error message. With that the user can choose what to do. I don't think the idea of opening an editor is good. The user should open the file and not Calcurse. Maybe we could add a setting where the user chooses what happens on an merge conflict. With option for "manual handling", "fallback to old", "use calcurse data set" or "use remote data set" (apts). Do you have an idea how often a "bigger" merge conflict happens?

TripleTrable avatar Jul 29 '20 06:07 TripleTrable

Assuming that you'd take the diff between the old version and the external new version and apply the diff to the new local version, that merge would be a no-op right? That's not a bad thing but this should be easy enough to catch such that we don't need to call any external program.

How often a "big" merge conflict happens highly depends on the workflow of the specific user and it's impossible to make a general statement. As to how often it happens globally, I have no idea.

We should definitely discuss some more details before this gets implemented.

What does partial merge mean? How are we going to explain the partial merge to the user? By default, patch(1) will report failed hunks in a .rej file which is not very user friendly. Would performing the partial merge from remote to local, then showing just the patched local alongside the remote be a good option? If we don't have a good presentation, I think it may even better to not merge at all and let the user to the merge all by themselves.

Do we require a specific version of patch(1)? Which environment variable will we use to make the binary configurable?

Why do you think having the user open the editor (or another merge tool) themselves is better than doing it from calcurse (with the exception of periodic save)?

What happens if a non-default data file location is chosen; do we always append .old or does the .old file end up in the normal data directory?

lfos avatar Jul 29 '20 10:07 lfos

What happens if a non-default data file location is chosen; do we always append .old or does the .old file end up in the normal data directory?

I would suggest we do the same as with the .new files. we simply append the .old to the file path.

Why do you think having the user open the editor (or another merge tool) themselves is better than doing it from calcurse (with the exception of periodic save)?

If your are doing something in calcurse and an auto merge creates a conflict, it would be unproductive to disturb the workflow with a popup or a question what to do. There should be only a visual indication that the merge failed. Then the user can merge outside of calcurse or use a shortcut to open the merge editor.

How are we going to explain the partial merge to the user?

A simple solution, but not a pretty one, would be to take the differing appointments and simply add them to the rest. Now the user Has both and he/she can decide which one to delete.

Your Idea to show the patched local and the remote is really good. But how do you plan on showing this? Do you want to show both alongside each other (like visual studio or vs code) or do you want to show them in your current layout with different colors.

Which environment variable will we use to make the binary configurable? I would suggest we store a string containing the binary config in the calcurse config.

Do we require a specific version of patch(1)? Depending on our next steps I would suggest to change from diff and patch to git merge-file. With git merge-file, the first idea to just add both conflicting items is really easy to implement with it's --union flag.

TripleTrable avatar Jul 31 '20 09:07 TripleTrable

Issues with periodic save and hooks (#309) are hopefully solved (#310) and patches available in the pu branch. I would like to recapitulate and review the case for an automatic merge and @TripleTrable's claim that periodic save is useless:

The periodic save function is kind of useless with calDAV sync because after a sync, you need to merge the new files with the buffered state. The periodic save function cancels the save if it detects that it is a periodic save. Please correct me if I am wrong.

Periodic save is only cancelled if the data file has changed.

Let's assume that calcurse is set up with a post-save hook to syncronize with a remote server (like the one in contrib) and with periodic save enabled.

The user starts an interactive calcurse session and makes some changes (adds an appoinment, edits a note, whatever ...). After a save (either deliberate or by the next run of periodic save) the post-save hook is run. Assume that remote synchronization results in an updated data file apts:

periodic/interactive save --> remote sync (calcurse-caldav) --> data file update

The update happens unknown to the interactive calcurse, so at this point a reload of the data file is required, but does not happen automatically. Assume that the user issues a reload command and that no changes were made since the save. The reload will succeed and calcurse will report "Data were succesfully reloaded". If, on the other hand, the user makes further changes, the next save will encounter a "save conflict". If periodic, it will cancel the save and raise a system event to tell that an interactive save is required. If that is performed, the user must choose between a forced save, an interactive merge or a cancellation.

The "save conflict"-case leads to the merge scenario and the wish for an automatic merge. But I believe that in the typical use case there will be no save conflict. And the typical case should work automatically (without user intervention). What is missing is obviously an automatic reload after the remote sync:

save --> postsave/remote sync --> data file update --> (automatic) reload

A reload may be triggered from the post-save hook by sending a SIGUSR1 signal to calcurse. I have done some experiments and the outcome is mixed. In case calcurse is idle in the main input loop, waiting for a user command, the result is OK, and calcurse displays "Data were succesfully reloaded". But in other cases user activity is interrupted without warning, calcurse returns to the main input loop and an external program (pager, editor) keeps running. The reason is that the signal handler is too simple-minded. I am working on an improvement.

With this in place, I believe that the typical use case is covered without a need for an automatic merge.

There is a time lag between save and reload during which the user may make further changes; for example if the user is busy at the keyboard when periodic save runs. In this case an automatic merge may come in handy.

lhca avatar Jul 31 '20 19:07 lhca

I would suggest we do the same as with the .new files. we simply append the .old to the file path.

Soounds good.

If your are doing something in calcurse and an auto merge creates a conflict, it would be unproductive to disturb the workflow with a popup or a question what to do. There should be only a visual indication that the merge failed. Then the user can merge outside of calcurse or use a shortcut to open the merge editor.

I don't think I agree. A merge conflict should be solved immediately, before making further changes to anything. Making it harder for the user to resolve the conflict likely isn't a good idea. We can still give them a menu on how to proceed with the merge where "ignore" can be one of the options but opening the mergetool should be in there too.

How are we going to explain the partial merge to the user? A simple solution, but not a pretty one, would be to take the differing appointments and simply add them to the rest. Now the user Has both and he/she can decide which one to delete.

That works. If you intend to use git-merge-file, the default behavior (which marks problematic areas) might be an alternative too although I am not sure it would work great with the way items can move around in the calcurse data files. We should run a few experiments.

Your Idea to show the patched local and the remote is really good. But how do you plan on showing this? Do you want to show both alongside each other (like visual studio or vs code) or do you want to show them in your current layout with different colors.

We can use any mergetool the user likes. vimdiff is an option for vim users but there are many others that may behave differently. This is currently configurable using the $MERGETOOL environment variable.

I would suggest we store a string containing the binary config in the calcurse config.

All other external binaries (editor, pager, mergetool) are currently configurable using environment variables, so I think it'd make sense to use an environment variable for this too.

Depending on our next steps I would suggest to change from diff and patch to git merge-file. With git merge-file, the first idea to just add both conflicting items is really easy to implement with it's --union flag.

Depending on the next steps, that could be a reasonable choice, agreed.

lfos avatar Aug 01 '20 12:08 lfos