jabref
jabref copied to clipboard
Fix exception when saving and autosave trigger at the same time
Fixes #6684 Fixes #6644 Fixes #6102 Fixes #6000
Summary: Manual save + autosave produce an exception/race condition on saving, because both want to access the same file for moving the temp file to the target file.
Solution: Throttle multiple saving operations in a kind of queue
- [ ] Change in CHANGELOG.md described (if applicable)
- [ ] Tests created for changes (if applicable)
- [x] Manually tested changed features in running JabRef (always required)
- [ ] Screenshots added in PR description (for UI changes)
- [ ] Checked documentation: Is the information available and up to date? If not created an issue at https://github.com/JabRef/user-documentation/issues or, even better, submitted a pull request to the documentation repository.
I wouldn't remove auto-saving. Actually, I feel this should be the standard behavior: why do I need to remember as a user to save the file?
I will try if I can find a way to lock the saving to one thread
That might lead to performance problems: saving might take a few seconds so in the worst case the save requests add up. The same with "Save all" etc.
I think the underlying problem is really that the tmp file is reused between save operations. This is really not a good idea (also for data integrity).
I'm open for ideas: If you have autosave running and manually hit saving you run into problems. Even if you would write to another temp file first that doesn't help. You will get an exception nonetheless as the file is locked by another thread when it is moved. The idea would be cancel any further attempts of saving of the same database when one is already running.
You will get an exception nonetheless as the file is locked by another thread when it is moved.
Which file is locked?
My idea was:
- Write to unique temp file
- Copy temp file over target file
The first step should only lock the temp file, and write to it. There is no problem there since every write process has its own temp file, and there is no interaction. The second step is atomic, so there is no locking or inter-thread problem.
In addition to these technicalities, I think it be worthwhile to improve the user experience a bit. I guess it would be good to hard-cancel every running auto-save and normal save (and delete the temporary file), as soon as the user runs a new save for the same file. This reduces overhead.
@tobiasdiez Even though you have atomic moving, the other thread will throw an IllegalAccessException.
Following situation:
- Thread Autosave A
- Thread Manual Save B
A writes to temp files and now calls the Atomic File Move => Performs Atomic Moves temp file A to target file B writes to temp file and also wants to call the Atomic File Move (while the other thread is still moving) => Boom! You will get an IllegalAcessException for the target file because the other thread is still in the process of moving and probably has a lock on that file.
To reproduce: Set a breakpoint in that close method of AtomicFileOutputstream Type something in a field, wait for autosave to trigger Manually hit save.
the only way one could solve this would be to cancel the running? save operation
But this exception is coming from this lock, isn't it? (Which comes from the fact that we are writing to the same tmp file) https://github.com/JabRef/jabref/blob/ee2c9857c8a31ee6a45b6e360e13c80620e563bb/src/main/java/org/jabref/logic/exporter/AtomicFileOutputStream.java#L85
Since Files.move(...)
is atomic, I cannot set a breakpoint in it - so it's hard to test if an exception is thrown if two request for moving to the same target are received. I would expect that if thread A calls Files.move(...)
, moving started, thread B calls Files.move(...)
while move is in progress, then B simply waits until A is finished (since the move is atomic, the file system shouldn't be able to do anything in between and just queue further moves).
The temporary file lock is never set, its always null because the Files... Returns directly a File Channel. (see the comment in the code in my changes). It's sufficient to set a breakpoint in the close method, wait for both threads to arrive and then continue running. Atomic move will only work atomic if both files are on the same file disk (at least for unix)
I will update the code to show the problem
Good! Can you also please add a comment at the place where the exception is thrown (I don't really have time right now to play with this in detail).
I tested the idea to create the file in the System's TEMP Folder, but as I expected an atomic move is not possible if the TEMP folder is on a different disk than the library file. (My Temp folder is on a SSD, while my jabref bib is on a different HDD, "Atomic move between providers is not supported)
I tested you idea with a unique file name in the same folder which results in an exception when multipe threads try to acess the file. I commented in the code where to put the breakpoint.
Locking the section could solve this, but the problem is that when the saving is done manually, it's the FX thread which executes the save operation. So that should also be a background task
1:25:40.172 [pool-6-thread-1] ERROR org.jabref.logic.autosaveandbackup.BackupManager - Error while saving to fileX:\Users\CS\Documents\_JABREFTEMP\newBiblatexTEst.bib.sav
java.nio.file.AccessDeniedException: X:\Users\CS\Documents\_JABREFTEMP\newBiblatexTEst.bib.sav112661705549979460.tmp -> X:\Users\CS\Documents\_JABREFTEMP\newBiblatexTEst.bib.sav
at sun.nio.fs.WindowsException.translateToIOException(WindowsException.java:89) ~[?:?]
at sun.nio.fs.WindowsException.rethrowAsIOException(WindowsException.java:103) ~[?:?]
at sun.nio.fs.WindowsFileCopy.move(WindowsFileCopy.java:309) ~[?:?]
at sun.nio.fs.WindowsFileSystemProvider.move(WindowsFileSystemProvider.java:292) ~[?:?]
at java.nio.file.Files.move(Files.java:1426) ~[?:?]
at org.jabref.logic.exporter.AtomicFileOutputStream.close(AtomicFileOutputStream.java:203) ~[main/:?]
at sun.nio.cs.StreamEncoder.implClose(StreamEncoder.java:353) ~[?:?]
at sun.nio.cs.StreamEncoder.close(StreamEncoder.java:168) ~[?:?]
at java.io.OutputStreamWriter.close(OutputStreamWriter.java:255) ~[?:?]
at org.jabref.logic.exporter.BibDatabaseWriter.savePartOfDatabase(BibDatabaseWriter.java:221) ~[main/:?]
at org.jabref.logic.exporter.BibDatabaseWriter.saveDatabase(BibDatabaseWriter.java:161) ~[main/:?]
at org.jabref.logic.autosaveandbackup.BackupManager.performBackup(BackupManager.java:139) ~[main/:?]
at java.util.Optional.ifPresent(Optional.java:176) ~[?:?]
at org.jabref.logic.autosaveandbackup.BackupManager.lambda$4(BackupManager.java:164) ~[main/:?]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) [?:?]
at java.util.concurrent.FutureTask.run(FutureTask.java:264) [?:?]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304) [?:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) [?:?]
at java.lang.Thread.run(Thread.java:832) [?:?]
21:25:40.184 [JavaFX Application Thread] INFO org.jabref.gui.JabRefDialogService - Saving library...
Yeah, seems like we need a wrapper that controls which save operations are in progress, and if there is already one for the target file, then stops the old one and only then starts the new one... damn.
Thanks for your time investigating this!
@tobiasdiez's one works, because we use temp files.
Worker-Queue of Guava?
An ADR is required ^^.
Idea: put the logic into save database action
In https://github.com/JabRef/jabref/pull/5669#issuecomment-560167766 - I put synchronized
here and there (with thinking where exactly it shoulb be). Maybe, this could help here, too?
This somehow relates to https://github.com/JabRef/jabref/pull/5967, but I don't know, how.
Synchronizing won't help here (always a new instance created) and I already have an idea how to fix this
What's the status here?
Work in progress. Have an idea to cancel the running or queue the new save actions
The JabRef maintainers will add the following name to the AUTHORS file. In case you want to use a different one, please comment here and adjust your name in your git configuration for future commits.
Nils Streijffert
The JabRef maintainers will add the following name to the AUTHORS file. In case you want to use a different one, please comment here and adjust your name in your git configuration for future commits.
m-mauersberger
The JabRef maintainers will add the following name to the AUTHORS file. In case you want to use a different one, please comment here and adjust your name in your git configuration for future commits.
chenyuheng
The JabRef maintainers will add the following name to the AUTHORS file. In case you want to use a different one, please comment here and adjust your name in your git configuration for future commits.
Fabio Marcos
@tobiasdiez I thought about your request to move parts of this into logic and tried to check for a way, but it's not really possible as this saving process involes GUI feedback. I am currently trying to solve the problem with the saveWithDifferentEncoding which also involves GUI. Argh
Sure, if it's not possible to extract the logic part right now, then this can be done later. However, I would still keep the manager (add save action, shutdown etc) in an extra class, separated from the action save action.
The JabRef maintainers will add the following name to the AUTHORS file. In case you want to use a different one, please comment here and adjust your name in your git configuration for future commits.
MLEP
I've completely refactored it and as @tobiasdiez suggested moved parts to logic. However, there currently is still an issue. The save actions are all executed after each other, normally they should be canceled if already running. That works only for the Backup/Autosave manager not for my new save actions. There is soehow an issue I need to investigate.
Fuu. Now the autosave is broken and the saving is still not canceling otherwise. Argh. It's a shit.
I somehow need an only instance of the TaskEecutor otherwise it does not make sense with canceling. Need ot hink about the architecutre
What's the status here? 😅
Not even close to working