parsec-cloud

FSLocalStorageOperationalError on disk full

Open sentry-io[bot] opened this issue 2 years ago • 4 comments

Sentry Issue: PARSEC-QZW

OperationalError: disk I/O error
  File "parsec/core/fs/storage/local_database.py", line 74, in _manage_operational_error
    yield
  File "parsec/core/fs/storage/local_database.py", line 122, in _create_connection
    self._conn.execute("PRAGMA journal_mode=WAL")

FSLocalStorageOperationalError: 
(16 additional frame(s) were not displayed)
...
  File "parsec/core/fs/storage/local_database.py", line 39, in run
    await self._connect()
  File "parsec/core/fs/storage/local_database.py", line 130, in _connect
    await self._create_connection()
  File "parsec/core/fs/storage/local_database.py", line 123, in _create_connection
    self._conn.execute("PRAGMA synchronous=NORMAL")
  File "async_generator/_util.py", line 53, in __aexit__
    await self._agen.athrow(type, value, traceback)
  File "parsec/core/fs/storage/local_database.py", line 102, in _manage_operational_error
    raise FSLocalStorageOperationalError from exception

Uncaught error

sentry-io[bot] avatar May 11 '22 11:05 sentry-io[bot]

related to #2083

touilleMan avatar May 11 '22 11:05 touilleMan

Sentry issue: PARSEC-QZX

sentry-io[bot] avatar May 11 '22 11:05 sentry-io[bot]

Sentry issue: PARSEC-R04

sentry-io[bot] avatar May 11 '22 11:05 sentry-io[bot]

Here's a typical scenario related to this issue:

During a file synchronization, an access to the local database fails with an OperationalError. This might happen in set_clean_block due to the disk being full. OperationalErrors are handled by immediately closing the connection to the local database, in order to avoid potential data corruption. Since the database is now closed, an FSLocalStorageClosedError is raised in another nursery task that was running concurrently. A MultiError is then raised and logged, which is good because we want to know about those combined exceptions. This exception bubbles up until it closes the backend connection. The user gets a notification and checks the status of the synchronizing file, which fails again since the local database has been closed.
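The first half of this scenario (operational error, forced close, then a closed-storage error for the next caller) can be reproduced with a minimal synchronous sketch. This is not the actual parsec code: `LocalDatabaseSketch`, `StorageOperationalError`, `StorageClosedError` and `run_scenario` are hypothetical names, and a query on a missing table stands in for the disk-full failure because both surface as `sqlite3.OperationalError`.

```python
import sqlite3
from contextlib import contextmanager


class StorageOperationalError(Exception):
    """Hypothetical stand-in for FSLocalStorageOperationalError."""


class StorageClosedError(Exception):
    """Hypothetical stand-in for FSLocalStorageClosedError."""


class LocalDatabaseSketch:
    def __init__(self, path=":memory:"):
        self._conn = sqlite3.connect(path)

    @contextmanager
    def _manage_operational_error(self):
        try:
            yield
        except sqlite3.OperationalError as exc:
            # Close right away so that later writes cannot corrupt the database.
            self._conn.close()
            self._conn = None
            raise StorageOperationalError(str(exc)) from exc

    def execute(self, sql, params=()):
        # Any caller arriving after the forced close gets a closed-storage error.
        if self._conn is None:
            raise StorageClosedError("local database is closed")
        with self._manage_operational_error():
            return self._conn.execute(sql, params).fetchall()


def run_scenario():
    db = LocalDatabaseSketch()
    db.execute("CREATE TABLE chunks (id INTEGER PRIMARY KEY)")
    outcomes = []
    # The first statement fails with sqlite3.OperationalError (assumption:
    # this simulates the disk-full error); the second one is a valid query
    # that now hits the closed connection.
    for sql in ("INSERT INTO missing_table VALUES (1)", "SELECT * FROM chunks"):
        try:
            db.execute(sql)
            outcomes.append("ok")
        except (StorageOperationalError, StorageClosedError) as exc:
            outcomes.append(type(exc).__name__)
    return outcomes
```

In the real application the two failures happen in concurrent nursery tasks rather than sequentially, which is why they end up combined in a MultiError.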

My conclusion is that everything mostly happens as we wanted so far. Note that when those errors happen during a file import (which is the most probable case), we're already prompting the user with a message asking them to check their disk space. So it's probable that the user:

  • created a large file
  • reached disk capacity
  • got the prompt about checking disk space
  • the new file still got synchronized (partially)
  • then the scenario described above happened

Another probable case for filling up the disk is opening large files that have been shared by another user.

In conclusion, things mostly go as we planned but there are clearly things that can be improved. The following questions need to be answered:

  • Should we prompt the user to check disk space in cases other than file import?
  • Should we have better control over the re-connection to the local database?
  • Should we really close the backend connection during a local storage error?

A potential way of dealing with those issues is to log out the user when a failing local storage is detected, and to check disk space when a user tries to log in, in order to prevent those issues in the first place.
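As a rough illustration of the login-time check, here is a minimal sketch based on the standard library's `shutil.disk_usage`. The function name `has_enough_disk_space` and the `MIN_FREE_BYTES` threshold are assumptions made for this example; nothing in this issue specifies how much free space should be required or where the check would live.

```python
import shutil

# Hypothetical threshold (100 MiB); the issue does not specify a value.
MIN_FREE_BYTES = 100 * 1024 * 1024


def has_enough_disk_space(path, min_free_bytes=MIN_FREE_BYTES):
    """Return True if the filesystem holding `path` has at least
    `min_free_bytes` bytes available."""
    usage = shutil.disk_usage(path)
    return usage.free >= min_free_bytes
```

A login flow could call this with the directory holding the local database and show the existing "check your disk space" prompt when it returns False, instead of letting the storage fail later.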

vxgmichel avatar May 11 '22 13:05 vxgmichel