
[Bug]: Library sometimes deleted when application closed

Open Nividica opened this issue 1 year ago • 4 comments

Checklist

  • [X] I am using an up-to-date version.
  • [X] I have read the documentation.
  • [X] I have searched existing issues.

TagStudio Version

Alpha v9.3.2

Operating System & Version

Windows 10

Description

Occasionally, when closing TagStudio via the X button, it will wipe the ts_library.json file, leaving it at zero bytes, despite the logs claiming the library was saved.

I spent some time fiddling with the code and stumbled upon a potential fix: move the call to self.to_json() outside of the with open( block in the save_library_to_disk function.

Example

        # If `to_json` is called in the `with` block, the file will sometimes fail to write
        jsonLibrary = self.to_json()

        with open(
            self.library_dir / TS_FOLDER_NAME / filename, "w", encoding="utf-8"
        ) as outfile:
            outfile.flush()
            ujson.dump(
                jsonLibrary,
                outfile,
                ensure_ascii=False,
                escape_forward_slashes=False,
            )

Note that having jsonLibrary = self.to_json() inside the with open( block did not fix the issue.

I stumbled into this while logging the size of the jsonLibrary to see whether it was actually being populated, and comparing the sizes at different places in the function. (It was consistently the same size, regardless of whether the bytes were written to disk or not.)

With that change in place, I was able to go through over a dozen open/close cycles and the library saved each time, where previously it would fail to save by around the fifth close.

I can't pretend to understand why this seems to have solved the issue, so I cannot guarantee the robustness of the proposed fix.
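A defensive pattern in the same spirit as the fix above (a sketch, not the project's actual code; save_json_safely is a hypothetical helper, and the standard-library json stands in for ujson here) is to serialize the whole library to a string before the file is ever opened, so a serialization failure can never leave a truncated file, and to fsync before closing so the OS cannot defer the physical write:

```python
import json
import os


def save_json_safely(path, data):
    # Serialize fully in memory first; if this raises, the existing
    # file on disk is never opened and therefore never truncated.
    text = json.dumps(data, ensure_ascii=False)
    with open(path, "w", encoding="utf-8") as outfile:
        outfile.write(text)
        outfile.flush()             # push Python's buffer to the OS
        os.fsync(outfile.fileno())  # ask the OS to push its buffers to disk
```

The escape_forward_slashes argument in the issue's snippet is ujson-specific and has no standard-library equivalent, so it is omitted here.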

Other Things I Tried

  • Moving outfile.flush() after the ujson.dump call
  • Adding outfile.close() after the ujson.dump call

Work Around

During my testing I discovered that if I manually save the library and then immediately close the app, the bug never seems to happen.

Additional Info

  • My library
    • is stored on an internal SSD
    • is not stored on the C:\ disk
    • has a little over 4K files in it
    • json file is ~340K
  • This may be related to #320

Expected Behavior

Data should be written to the library file on exit

Steps to Reproduce

  1. Open Tag Studio
  2. Create a new library
  3. Manually save
  4. Note the size of the save file (mine is ~340KB)
  5. Close Tag Studio
  6. Open Tag Studio
  7. Wait for it to load the library
  8. Close Tag Studio
  9. Check the size of the save file
  10. If the save file is not 0 bytes, repeat steps 6-9

Logs

Log output when it fails to save

Saving Library...
[LIBRARY] Saving Library to Disk...
[LIBRARY] Formatting Tags to JSON...
[LIBRARY] Formatting Entries to JSON...
[LIBRARY] Formatting Collations to JSON...
[LIBRARY] Done Formatting to JSON!
[LIBRARY] Library saved to disk in 0.008 seconds
[SHUTDOWN] Ending Thumbnail Threads...

Log output when it successfully saves

Saving Library...
[LIBRARY] Saving Library to Disk...
[LIBRARY] Formatting Tags to JSON...
[LIBRARY] Formatting Entries to JSON...
[LIBRARY] Formatting Collations to JSON...
[LIBRARY] Done Formatting to JSON!
[LIBRARY] Library saved to disk in 0.031 seconds
[SHUTDOWN] Ending Thumbnail Threads...

Nividica avatar Aug 08 '24 07:08 Nividica

I've continued trying to track down the root cause of the issue, and it seems to be a timing issue, potentially one outside of the app.

Starting with the original code, I added a print for the file path. Then, printing the size of the file before and after a call to sleep, the two sizes disagree with one another:

jsonPath = self.library_dir / TS_FOLDER_NAME / filename
print(f"[LIBRARY] Save path: {jsonPath}")
with open(
    jsonPath, "w", encoding="utf-8"
) as outfile:
    outfile.flush()
    ujson.dump(
        self.to_json(),
        outfile,
        ensure_ascii=False,
        escape_forward_slashes=False,
    )
    # , indent=4 <-- How to prettyprint dump
print(f"[LIBRARY] Pre-sleep Size: {jsonPath.stat().st_size} bytes")
time.sleep(0.1)
print(f"[LIBRARY] Post-sleep Size: {jsonPath.stat().st_size} bytes")

On my 3rd close, the bug happened:

[LIBRARY] Saving Library to Disk..
[LIBRARY] Save path: E:\Screenshots\.TagStudio\ts_library.json
[LIBRARY] Formatting Tags to JSON...
[LIBRARY] Formatting Entries to JSON...
[LIBRARY] Formatting Collations to JSON...
[LIBRARY] Done Formatting to JSON!
[LIBRARY] Pre-sleep Size: 348937 bytes
[LIBRARY] Post-sleep Size: 0 bytes
[LIBRARY] Library saved to disk in 0.140 seconds

Then I added a call to sleep just before ujson.dump, and the bug magically went away:

with open(
    jsonPath, "w", encoding="utf-8"
) as outfile:
    outfile.flush()
    time.sleep(0.1)
    ujson.dump(
        self.to_json(),
        outfile,
        ensure_ascii=False,
        escape_forward_slashes=False,
    )
    # , indent=4 <-- How to prettyprint dump
print(f"[LIBRARY] Pre-sleep Size: {jsonPath.stat().st_size} bytes")
time.sleep(0.1)
print(f"[LIBRARY] Post-sleep Size: {jsonPath.stat().st_size} bytes")
Log output:

[LIBRARY] Saving Library to Disk..
[LIBRARY] Save path: E:\Screenshots\.TagStudio\ts_library.json
[LIBRARY] Formatting Tags to JSON...
[LIBRARY] Formatting Entries to JSON...
[LIBRARY] Formatting Collations to JSON...
[LIBRARY] Done Formatting to JSON!
[LIBRARY] Pre-sleep Size: 348937 bytes
[LIBRARY] Post-sleep Size: 348937 bytes
[LIBRARY] Library saved to disk in 0.227 seconds

Ran this test 10 times, and no bug.

As a final set of tests, I went back to the original code with just the sleep inside the with-block:

with open(
    self.library_dir / TS_FOLDER_NAME / filename, "w", encoding="utf-8"
) as outfile:
    outfile.flush()
    time.sleep(0.1)
    ujson.dump(
        self.to_json(),
        outfile,
        ensure_ascii=False,
        escape_forward_slashes=False,
    )
    # , indent=4 <-- How to prettyprint dump

Ran this test 10 times, and no bug.

So there seem to be two independent "fixes" for the bug:

  1. Move self.to_json() outside of the with-block
  2. Add time.sleep(0.1) just before ujson.dump

I can't wrap my head around what is going on here. Option 2 would suggest there isn't enough time between opening the file and writing to it, but why would an artificial wait be needed between open and write?

If I drastically increase the sleep time for Option 2 to several seconds, I can see that when the file is opened for writing the existing contents are, as expected, cleared (file size 0), then the data is saved and the file size goes to ~340K.

But none of that explains why Option 1 would have any effect at all, and yet I can't repro the bug when self.to_json() is outside of the with-block.

If the source of the bug really is some kind of race condition, then I'm starting to wonder if a true fix for this would be to save a new library file, delete the old library file, then rename the new file.

Nividica avatar Aug 09 '24 06:08 Nividica

I ran with the temp file idea and have not run into the bug again:

        tempJsonPath = self.library_dir / TS_FOLDER_NAME / "temp.json"
        jsonPath = self.library_dir / TS_FOLDER_NAME / filename

        # Delete temp file if exists
        tempJsonPath.unlink(missing_ok=True)

        # Write to a temp file
        with open(
            tempJsonPath, "w", encoding="utf-8"
        ) as outfile:
            ujson.dump(
                self.to_json(),
                outfile,
                ensure_ascii=False,
                escape_forward_slashes=False,
            )
        
# Delete existing library
        jsonPath.unlink(missing_ok=True)

        # Rename temp file
        tempJsonPath.rename(jsonPath)

In my opinion, of the three solutions, this one feels like the winner.

Pros

  • No weird calls to sleep
  • No worries about someone accidentally "optimizing" the code by getting rid of a variable and moving self.to_json() back inside the with-block
  • If something goes wrong, there is data saved somewhere.
    • If writing to the temp file fails or is interrupted by a crash or power cut, the original ts_library.json file is left untouched
    • If deleting ts_library.json fails, temp.json has the latest data
    • If renaming fails, temp.json still exists
  • The open_library code could be expanded to check for the temp.json and attempt to load it first
    • If temp.json fails to read/parse, fall back on ts_library.json
    • Could potentially check timestamps

Cons

  • Will briefly require twice as much disk space
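The open_library expansion suggested in the pros above could be sketched like this (a hypothetical helper, not TagStudio's code; the temp.json name follows the snippet above):

```python
import json
from pathlib import Path


def load_library_with_fallback(folder: Path, filename: str = "ts_library.json"):
    # Prefer a leftover temp file (it holds the latest data if the
    # rename step was interrupted), then fall back to the main file.
    for candidate in (folder / "temp.json", folder / filename):
        try:
            with open(candidate, encoding="utf-8") as infile:
                return json.load(infile)
        except (FileNotFoundError, ValueError):
            continue  # missing or unparseable; try the next candidate
    return None  # nothing readable was found
```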

Nividica avatar Aug 09 '24 07:08 Nividica

Just a minor note (which you have probably already thought of yourself): at some point a user will decide to name their database temp, and then you will have a name collision. Perhaps deriving the temp name from the library filename would be a better choice? Still not guaranteed to be unique, but if you want a stable name you have to stop somewhere: tempJsonPath = self.library_dir / TS_FOLDER_NAME / (filename + ".temp.json"). Or compute a hash of the file name and use that (+ ".json") as the temp file name.
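The hash idea might look like this (a sketch; temp_name_for is a hypothetical helper):

```python
import hashlib


def temp_name_for(filename: str) -> str:
    # Derive a stable temp-file name from the library's own filename,
    # so different libraries in the same folder never collide.
    digest = hashlib.sha1(filename.encode("utf-8")).hexdigest()
    return digest + ".json"
```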

thrynae avatar Aug 22 '24 14:08 thrynae

Also, looking at the code, if the application crashes between the "unlink" and the "rename" call, there would again be data loss (unless, as suggested, code will be added to recover from the temp file in that case).

Maybe instead of unlink and rename (which first deletes the database and then renames the temp one to the real database name) it'd be better to use os.replace.

That way you could call os.replace(temp_db, actual_db) without first deleting the actual database file, and the OS will guarantee (as of Python 3.3) that the move will either fail entirely (keeping the old DB) or succeed entirely (and you'll have the new file saved). That means that there's no need for extra code to recover from a temp file - the database will always be there no matter when the process was interrupted.
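A sketch of that approach (hypothetical helper and names; the standard-library json stands in for ujson):

```python
import json
import os
from pathlib import Path


def save_library_atomically(folder: Path, filename: str, data: dict) -> None:
    target = folder / filename
    temp = folder / (filename + ".tmp")
    with open(temp, "w", encoding="utf-8") as outfile:
        json.dump(data, outfile, ensure_ascii=False)
        outfile.flush()
        os.fsync(outfile.fileno())  # ensure the bytes hit disk before the swap
    # Atomic swap: the target either keeps its old contents or gets the
    # complete new file; it is never left empty or half-written.
    os.replace(temp, target)
```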

Also, at least to me it looks like the code currently in save_library_to_disk could fail on any OS, not just Windows. If the open call succeeds (and truncates the existing file) but the json dump call then fails for whatever reason, the file will probably be left corrupted.

Leseratte10 avatar Sep 09 '24 14:09 Leseratte10

Should be fixed as of #554, coming with the next 9.4.x patch. If the issue persists, I can reopen this issue.

CyanVoxel avatar Nov 04 '24 20:11 CyanVoxel