org-anki icon indicating copy to clipboard operation
org-anki copied to clipboard

fix the problem that some ANKI_NOTE_IDs could be missing

Open whatacold opened this issue 1 year ago • 15 comments

Hi,

I've found that problem the result in this issue: #45 , when we're trying to sync multi notes at the same time, the point field of a note struct could possibly change after inserting some note ids from other addNote replies, so that it fails to find org item using that stale point.

I'm trying to fix it by replacing the point with a marker, it works ok. But a lot of markers in a buffer will make emacs slow, so I delete them afterwards as per docs, after we're done with the sync or delete command. But there is another problem here, it looks like that, in org-anki--sync-notes it will delete the markers before the promise-chain.

I'm confused here as I'm not familiar with promise. I'm guessing that the callbacks are called async, but my helper command based on org-anki-sync-entry did block, so I don't know how it works under the hood. And my helper command is like:

(defun w/org-anki-sync-subentries ()
  "Sync all subentries of the current entry at point.

Also respect `org-anki-skip-function'."
  (interactive)
  (save-excursion
    (let* ((next-level (1+ (org-current-level))))
      (org-map-entries (lambda ()
                         (when (and t
                                    (= next-level (org-element-property :level (org-element-at-point))))
                           (org-anki-sync-entry)))
                       nil
                       'tree
                       org-anki-skip-function))))

What's the proper way to delete the markers correctly and safely?

whatacold avatar Sep 09 '23 15:09 whatacold

I think using a marker could also resolve #55 , as we can get the target buffer using marker-buffer.

whatacold avatar Sep 10 '23 00:09 whatacold

P.S. I observed that it used multi action even if there was only one action inside while I was debugging, is it normal?

whatacold avatar Sep 10 '23 00:09 whatacold

Hi Markus,

I think maybe there are 3 ways to delete the markers:

  1. Set the marker to nil on every succ and error callback, but it's cumbersome and hard to maintain the code.
  2. Use some "destructor" mechanism for note, it seems finalizer can be helpful, but I failed to figure out how to use it.
  3. Keep a state inside the note struct, and mark the state as completed in both succ and error callbacks. And then save all notes in an internal list, and check the list periodically. This would also be cumbersome, but it can works, although not in an elegant way.

What do you think? Any other ideas?

whatacold avatar Sep 10 '23 09:09 whatacold

Hi Markus,

I think maybe there are 3 ways to delete the markers:

1. Set the marker to nil on every succ and error callback, but it's cumbersome and hard to maintain the code.

2. Use some "destructor" mechanism for note, it seems [finalizer](https://www.gnu.org/software/emacs/manual/html_node/elisp/Finalizer-Type.html) can be helpful, but I failed to figure out how to use it.

3. Keep a state inside the note struct, and mark the state as completed in both succ and error callbacks. And then save all notes in an internal list, and check the list periodically.
   This would also be cumbersome, but it can works, although not in an elegant way.

What do you think? Any other ideas?

Hmm, maybe this marker thing isn't a problem, after reading Drew's answer here: https://stackoverflow.com/questions/36926513/are-there-elisp-functions-that-list-markers-in-a-given-buffer

So the list of markers you can find in the C source contains "real markers" as well as "zombie markers", i.e. markers which have become unreachable and will be eliminated at the next GC.

When a note gets GC'ed, it will no longer references the marker so that the marker will become a zombie, and it will get eliminated eventually.

whatacold avatar Sep 10 '23 09:09 whatacold

Hi @hwiorn @threddast,

If I am understanding correctly, this branch will also fix your problem, would you confirm that? thanks

whatacold avatar Sep 10 '23 10:09 whatacold

Hi @whatacold, thank you for your efforts! I haven't had a chance to replicate the bug, but here are my initial thoughts:

  • the ANKI_NOTE_IDs are added to entries backwards with regarding to where they are located in the file, so (theoretically) the point in the note struct should be a locations to run goto-char/org-set-property (because we are always adding the property block to the org entry from latest to earliest)
  • there is a performance penalty with having markers because every insertion/deletion will (AFAIU, maybe I'm wrong?) go through all of the markers for that buffer and adjust them according to whether the text was added before the marker (= marker needs to change) or after (= no change required)
  • the docs say that setting a marker to nil is the way to "delete them": garbage collector will come and remove these when they are out of scope, but at least they won't be considered when new text was added or removed

Although I use promises, then these are used only for the API requests to AnkiConnect (the buffer isn't changed by multiple threads at the same time so adding the note IDs backwards should still work).


I'm still in process in trying to replicate this bug, just tried a test file with 1000 entries and this seemed to work. Maybe it's something about syncing subtrees or multiple files that triggers the bug..

eyeinsky avatar Sep 10 '23 18:09 eyeinsky

Hi @eyeinsky ,

I think you can't reproduce the problem is because M-x org-anki-sync-all (I guess you use this command?) puts all the actions into a multi action, it's ok as there is only one curl callback need to be called, so that point mechanism work ok after you sorting notes by their points.

I believe you can do that if you call multiple org-anki-sync-entry times simultaneously, like I did in w/org-anki-sync-subentries, or the problem mentioned in the original issue report.

Here is the org file that I used to reproduce this problem:

* dumb root
:PROPERTIES:
:ANKI_DECK: org-anki::bug-batch-sync
:END:

** 0
0
** 1
1
** 2
2
** 3
3
** 4
4
** 5
5
** 6
6
** 7
7
** 8
8
** 9
9

Unfortunately, I don't know markers well either. What are other options if we don't use it, overlay?

whatacold avatar Sep 11 '23 14:09 whatacold

Thanks, I think I now understand why this issue occurs! What about collecting the notes into a list in the lambda in org-map-entries and then call the multi with that list? The reason to use multi at all was that running one HTTP request per org entry was slow for large files.

eyeinsky avatar Sep 11 '23 14:09 eyeinsky

Sorry, I should post my reproduction recipe first.

What about collecting the notes into a list in the lambda in org-map-entries and then call the multi with that list?

Yeah, for this specific case I can refactor my helper function like that. But it will still be a problem whenever there are multi requests being executed "concurrently", especially for people unfamiliar with the code.

The reason to use multi at all was that running one HTTP request per org entry was slow for large files.

Hmm, why is it slow? After all, the request lib do it async. And I also remember that emacs freezed when I was sync'ing a lot for entries simultaneously before, still don't know why.

whatacold avatar Sep 11 '23 15:09 whatacold

When syncing multiple entries then running a separate HTTP request for each will add some overhead because a separate curl process is started for each, HTTP headers are sent and received multiple times (these are likely mostly identical). And although one can initiate them concurrently (in effect pipelining them) then as they come back they need a synchronization mechanism such that the results won't clash. This is where the markers help I guess.

Hmm, if markers do solve the issue then I wonder how that compares to doing one multi request? As in: changing the buffer would need to be synchronized somehow anyway, but maybe having a queue of arrived responses is faster than collecting a list, doing one multi request and then handling the response.

eyeinsky avatar Sep 11 '23 15:09 eyeinsky

When syncing multiple entries then running a separate HTTP request for each will add some overhead because a separate curl process is started for each, HTTP headers are sent and received multiple times (these are likely mostly identical).

Then I think it's mostly due to starting a curl process for every note, sending and receiving HTTP headers and bodies are curl's job, it should have much impact on emacs side.

And although one can initiate them concurrently (in effect pipelining them) then as they come back they need a synchronization mechanism such that the results won't clash. This is where the markers help I guess.

In my current implementation, there is no synchronization for replies, it just adds the note id for whoever comes first, and the marker promises that goto-char goes to the right position.

Hmm, if markers do solve the issue then I wonder how that compares to doing one multi request? As in: changing the buffer would need to be synchronized somehow anyway, but maybe having a queue of arrived responses is faster than collecting a list, doing one multi request and then handling the response.

I'm a little confused here, what does "having a queue of arrived responses" for, isn't there only one reply if using a multi request?

My concern is that we can only guarantee to use multi requests in org-anki.el itself whenever possible, as it already did in org-anki-sync/update/delete-all, but it's hard to make sure that users' code to do that, which could possiblly run into problems as it did here.

whatacold avatar Sep 11 '23 15:09 whatacold

In my current implementation, there is no synchronization for replies, it just adds the note id for whoever comes first, and the marker promises that goto-char goes to the right position.

Right, what I meant was the markers indirectly help with synchronization (they must, because they fix the issue).

I'm a little confused here, what does "having a queue of arrived responses" for, isn't there only one reply if using a multi request?

Nevermind, I messed up with this thought :) (I was just wondering about running multiple org-anki-sync-entry and then instead of using markers, use some global list to add the responses to, and then another (single) thread adds these to the file one-by-one. But this won't work still, as the responses can be out-of-order..)

eyeinsky avatar Sep 11 '23 16:09 eyeinsky

I'm a little confused here, what does "having a queue of arrived responses" for, isn't there only one reply if using a multi request?

Nevermind, I messed up with this thought :) (I was just wondering about running multiple org-anki-sync-entry and then instead of using markers, use some global list to add the responses to, and then another (single) thread adds these to the file one-by-one. But this won't work still, as the responses can be out-of-order..)

Right, I also thought of a similar one as mentioned above, as below:

Keep a state inside the note struct, and mark the state as completed in both succ and error callbacks. And then save all notes in an internal list, and check the list periodically. This would also be cumbersome, but it can works, although not in an elegant way.

Maybe we could do some experiments here to see how markers hurt the editing performance in a file, for example, sync or update 1000 entries in a relatively big file, and then do some basic insertion/deletion operations to see if emacs becomes slow.

What do you think?

whatacold avatar Sep 11 '23 16:09 whatacold

@whatacold Yup, measuring performance would be great! I'm not against markers, I was just thinking that doing a single mult request is both simpler and faster. But if it isn't, then we could try markers instead.

But if it does turn out that multi is faster, then we could add some mechanism to display a message when org-anki-sync-entry discovers that more than one of them is running at the same time.

eyeinsky avatar Sep 12 '23 08:09 eyeinsky

Hi @eyeinsky,

After some considerations, I think your way is better. Marker is a little bit risky to use here, we have other options.

So I change the code a bit and put a note in the docs for org-anki-sync-entry.

whatacold avatar Oct 06 '23 04:10 whatacold