onyx icon indicating copy to clipboard operation
onyx copied to clipboard

Not reranked docs have no score --> Constraint Violation

Open scriptator opened this issue 1 year ago • 0 comments

When enabling the semantic re-ranker I get the following error in the chat mode:

  File "/app/danswer/chat/process_message.py", line 330, in stream_chat_message
    reference_db_search_docs = [
                               ^
  File "/app/danswer/chat/process_message.py", line 331, in <listcomp>
    create_db_search_doc(server_search_doc=top_doc, db_session=db_session)
  File "/app/danswer/db/chat.py", line 654, in create_db_search_doc
    db_session.commit()
  File "/usr/local/lib/python3.11/site-packages/sqlalchemy/orm/session.py", line 1906, in commit
    trans.commit(_to_root=True)
  File "<string>", line 2, in commit
  File "/usr/local/lib/python3.11/site-packages/sqlalchemy/orm/state_changes.py", line 137, in _go
    ret_value = fn(self, *arg, **kw)
                ^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/sqlalchemy/orm/session.py", line 1221, in commit
    self._prepare_impl()
  File "<string>", line 2, in _prepare_impl
  File "/usr/local/lib/python3.11/site-packages/sqlalchemy/orm/state_changes.py", line 137, in _go
    ret_value = fn(self, *arg, **kw)
                ^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/sqlalchemy/orm/session.py", line 1196, in _prepare_impl
    self.session.flush()
  File "/usr/local/lib/python3.11/site-packages/sqlalchemy/orm/session.py", line 4154, in flush
    self._flush(objects)
  File "/usr/local/lib/python3.11/site-packages/sqlalchemy/orm/session.py", line 4290, in _flush
    with util.safe_reraise():
  File "/usr/local/lib/python3.11/site-packages/sqlalchemy/util/langhelpers.py", line 147, in __exit__
    raise exc_value.with_traceback(exc_tb)
  File "/usr/local/lib/python3.11/site-packages/sqlalchemy/orm/session.py", line 4251, in _flush
    flush_context.execute()
  File "/usr/local/lib/python3.11/site-packages/sqlalchemy/orm/unitofwork.py", line 467, in execute
    rec.execute(self)
  File "/usr/local/lib/python3.11/site-packages/sqlalchemy/orm/unitofwork.py", line 644, in execute
    util.preloaded.orm_persistence.save_obj(
  File "/usr/local/lib/python3.11/site-packages/sqlalchemy/orm/persistence.py", line 93, in save_obj
    _emit_insert_statements(
  File "/usr/local/lib/python3.11/site-packages/sqlalchemy/orm/persistence.py", line 1223, in _emit_insert_statements
    result = connection.execute(
             ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1413, in execute
    return meth(
           ^^^^^
  File "/usr/local/lib/python3.11/site-packages/sqlalchemy/sql/elements.py", line 483, in _execute_on_connection
    return connection._execute_clauseelement(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1637, in _execute_clauseelement
    ret = self._execute_context(
          ^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1846, in _execute_context
    return self._exec_single_context(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1987, in _exec_single_context
    self._handle_dbapi_exception(
  File "/usr/local/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 2344, in _handle_dbapi_exception
    raise sqlalchemy_exception.with_traceback(exc_info[2]) from e
  File "/usr/local/lib/python3.11/site-packages/sqlalchemy/engine/base.py", line 1968, in _exec_single_context
    self.dialect.do_execute(
  File "/usr/local/lib/python3.11/site-packages/sqlalchemy/engine/default.py", line 920, in do_execute
    cursor.execute(statement, parameters)
sqlalchemy.exc.IntegrityError: (psycopg2.errors.NotNullViolation) null value in column "score" of relation "search_doc" violates not-null constraint

The reason for that is that the function rerank_chunks (https://github.com/danswer-ai/danswer/blob/aa7c811a9ace1244f5f6aaf5d4bb8e1a0ea1b5bb/backend/danswer/search/search_runner.py#L440) sets the score of retrieved documents that were not re-ranked to None. Later, the chat flow inserts them as search docs into the DB, where the constraint violation exception is triggered.

My current fix is to drop the search docs which exceed the number query.num_rerank in the function rerank_chunks because I don't think they are really useful anyways. Is that a good solution?

scriptator avatar Feb 13 '24 09:02 scriptator