securedrop-client icon indicating copy to clipboard operation
securedrop-client copied to clipboard

Investigate more robust error-handling for database transactions in storage.py

Open rocodes opened this issue 3 years ago • 3 comments

Description

In working on eg https://github.com/freedomofpress/securedrop-client/pull/1432, I've had the chance to get more familiar with storage.py, and it might make sense to investigate improvements we could make to our database transaction handling.

For example: a complex method like this one where we have a single commit() at the end might benefit from savepoints and some opportunities to roll back an invalid transaction if one is encountered (and some agreements about what that behaviour should be). We might want to standardize on one procedure that we use whenever doing database transactions. As a later goal, we might want to wrap database errors in our own exceptions instead of handling them in GUI classes.

This could especially be relevant for future features like multi-select or multi-delete, or other scenarios where we could increase the likelihood of encountering database errors. I'm just filing this issue so that we can set aside a little time for this, and also, to discuss what we'd like the behaviour to be.

How will this impact SecureDrop users?

Makes Workstation more robust, more suitable for heavier use (large numbers of sources/transactions, multiple Workstation computers per server instance, etc)

How would this affect the SecureDrop Workstation threat model?

No major changes; potential benefits to stability and/or reliability of product

User Stories

As a Workstation user, I want a smooth operating experience, including clean error-handling and predictable behaviour with large numbers of database transactions

rocodes avatar Mar 15 '22 14:03 rocodes

More evidence in support of this inquiry: I experienced the following crash after (1) using the Journalist Interface to delete all but 3 of ~200 source accounts previously synced and then (2) reopening the Client, which got only partway through its initial sync:

Traceback (most recent call last):
  File "/home/user/securedrop-client/.venv/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1244, in _execute_context
    cursor, statement, parameters, context
  File "/home/user/securedrop-client/.venv/lib/python3.7/site-packages/sqlalchemy/engine/default.py", line 552, in do_execute
    cursor.execute(statement, parameters)
sqlite3.OperationalError: database is locked

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/user/securedrop-client/securedrop_client/gui/widgets.py", line 967, in schedule_source_management
    self.controller, source, self.source_selected, self.adjust_preview
  File "/home/user/securedrop-client/securedrop_client/gui/widgets.py", line 1223, in __init__
    self.seen = self.source.seen
  File "/home/user/securedrop-client/securedrop_client/db.py", line 155, in seen
    for item in self.collection:
  File "/home/user/securedrop-client/securedrop_client/db.py", line 119, in collection
    collection.extend(self.messages)
  File "/home/user/securedrop-client/.venv/lib/python3.7/site-packages/sqlalchemy/orm/attributes.py", line 276, in __get__
    return self.impl.get(instance_state(instance), dict_)
  File "/home/user/securedrop-client/.venv/lib/python3.7/site-packages/sqlalchemy/orm/attributes.py", line 682, in get
    value = self.callable_(state, passive)
  File "/home/user/securedrop-client/.venv/lib/python3.7/site-packages/sqlalchemy/orm/strategies.py", line 723, in _load_for_state
    session, state, primary_key_identity, passive
  File "<string>", line 1, in <lambda>
  File "/home/user/securedrop-client/.venv/lib/python3.7/site-packages/sqlalchemy/orm/strategies.py", line 865, in _emit_lazyload
    .with_post_criteria(set_default_params)
  File "/home/user/securedrop-client/.venv/lib/python3.7/site-packages/sqlalchemy/ext/baked.py", line 531, in all
    return list(self)
  File "/home/user/securedrop-client/.venv/lib/python3.7/site-packages/sqlalchemy/ext/baked.py", line 434, in __iter__
    return q._execute_and_instances(context)
  File "/home/user/securedrop-client/.venv/lib/python3.7/site-packages/sqlalchemy/orm/query.py", line 3342, in _execute_and_instances
    result = conn.execute(querycontext.statement, self._params)
  File "/home/user/securedrop-client/.venv/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 988, in execute
    return meth(self, multiparams, params)
  File "/home/user/securedrop-client/.venv/lib/python3.7/site-packages/sqlalchemy/sql/elements.py", line 287, in _execute_on_connection
    return connection._execute_clauseelement(self, multiparams, params)
  File "/home/user/securedrop-client/.venv/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1107, in _execute_clauseelement
    distilled_params,
  File "/home/user/securedrop-client/.venv/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1248, in _execute_context
    e, statement, parameters, cursor, context
  File "/home/user/securedrop-client/.venv/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1466, in _handle_dbapi_exception
    util.raise_from_cause(sqlalchemy_exception, exc_info)
  File "/home/user/securedrop-client/.venv/lib/python3.7/site-packages/sqlalchemy/util/compat.py", line 383, in raise_from_cause
    reraise(type(exception), exception, tb=exc_tb, cause=cause)
  File "/home/user/securedrop-client/.venv/lib/python3.7/site-packages/sqlalchemy/util/compat.py", line 128, in reraise
    raise value.with_traceback(tb)
  File "/home/user/securedrop-client/.venv/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1244, in _execute_context
    cursor, statement, parameters, context
  File "/home/user/securedrop-client/.venv/lib/python3.7/site-packages/sqlalchemy/engine/default.py", line 552, in do_execute
    cursor.execute(statement, parameters)
sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) database is locked
[SQL: SELECT messages.id AS messages_id, messages.uuid AS messages_uuid, messages.filename AS messages_filename, messages.file_counter AS messages_file_counter, messages.size AS messages_size, messages.download_url AS messages_download_url, messages.is_downloaded AS messages_is_downloaded, messages.is_decrypted AS messages_is_decrypted, messages.download_error_id AS messages_download_error_id, messages.is_read AS messages_is_read, messages.content AS messages_content, messages.source_id AS messages_source_id, messages.last_updated AS messages_last_updated 
FROM messages 
WHERE ? = messages.source_id ORDER BY messages.id]
[parameters: (22,)]
(Background on this error at: http://sqlalche.me/e/e3q8)

In other words, two Client sessions were necessary to catch up to a bulk operation performed in the Journalist Interface.

cfm avatar May 24 '22 21:05 cfm

In the course of #1634, I would zoom out from this goal and suggest instead that what we need is a consistent way to mutate state across the application, which is a separate concern from how that state is persisted (and refreshed). As it is, I believe we're effectively letting transactionality introduce data races within the running application.

#1634 will offer some baby steps for how we might think and go about this.

cfm avatar Jun 27 '23 01:06 cfm

Briefly, here are my findings from #1634 → #1661:

  1. Now: Models can undergo mutations in jobs (on queue threads) and after controller events (on the main thread). This has two implications:
    1. Practice: We commit() changes frequently and erratically. It falls to the controller to signal both (a) mutations that are implied by the completion of jobs and (b) mutations that are direct results of GUI actions and events.
    2. Theory: Our transaction boundaries do not consistently line up with our job boundaries or our event boundaries.
  2. Future: https://groups.google.com/g/sqlalchemy/c/p8BgLuJQdkI/m/__hg-HrkCAAJ closely resembles the Client's architecture and should guide us in how we refactor our database-access and state-mutation patterns. After #1634, this would mean, for example:
    1. binding GUI widgets to detached SQLAlchemy model instances...
    2. updated on state transitions signaled by the controller...
    3. listening for events from a single-threaded queue of SQLAlchemy commit() operations.

I have snippets for what these patterns could look like in the Client; they're just not workable at the level of individual models as in #1634 and #1661.

cfm avatar Jul 01 '23 05:07 cfm