koschei
UnappliedChange table can run out of IDs under normal operation
In production, the repo resolver service keeps crashing with the following error:
Traceback (most recent call last):
File "/usr/lib64/python3.6/site-packages/sqlalchemy/engine/base.py", line 1182, in _execute_context
context)
File "/usr/lib64/python3.6/site-packages/sqlalchemy/engine/default.py", line 470, in do_execute
cursor.execute(statement, parameters)
psycopg2.DataError: integer out of range
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/lib64/python3.6/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/usr/lib64/python3.6/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/usr/lib/python3.6/site-packages/koschei/backend/main.py", line 60, in <module>
main()
File "/usr/lib/python3.6/site-packages/koschei/backend/main.py", line 52, in main
svc(backend.KoscheiBackendSession()).run_service()
File "/usr/lib/python3.6/site-packages/koschei/backend/service.py", line 85, in run_service
self.main()
File "/usr/lib/python3.6/site-packages/koschei/backend/services/repo_resolver.py", line 29, in main
self.process_repo(collection)
File "/usr/lib/python3.6/site-packages/koschei/backend/services/resolver.py", line 734, in process_repo
self.resolve_packages(sack, collection, new_packages)
File "/usr/lib/python3.6/site-packages/koschei/backend/services/resolver.py", line 535, in resolve_packages
self.generate_dependency_changes(sack, collection, packages, brs)
File "/usr/lib/python3.6/site-packages/koschei/backend/services/resolver.py", line 455, in generate_dependency_changes
self.persist_resolution_output(results)
File "/usr/lib/python3.6/site-packages/koschei/util.py", line 309, in decorated
res = fn(*args, **kwargs)
File "/usr/lib/python3.6/site-packages/koschei/backend/services/resolver.py", line 378, in persist_resolution_output
self.db.execute(insert(UnappliedChange, dependency_changes))
File "/usr/lib64/python3.6/site-packages/sqlalchemy/orm/session.py", line 1154, in execute
bind, close_with_result=True).execute(clause, params or {})
File "/usr/lib64/python3.6/site-packages/sqlalchemy/engine/base.py", line 945, in execute
return meth(self, multiparams, params)
File "/usr/lib64/python3.6/site-packages/sqlalchemy/sql/elements.py", line 263, in _execute_on_connection
return connection._execute_clauseelement(self, multiparams, params)
File "/usr/lib64/python3.6/site-packages/sqlalchemy/engine/base.py", line 1053, in _execute_clauseelement
compiled_sql, distilled_params
File "/usr/lib64/python3.6/site-packages/sqlalchemy/engine/base.py", line 1189, in _execute_context
context)
File "/usr/lib64/python3.6/site-packages/sqlalchemy/engine/base.py", line 1402, in _handle_dbapi_exception
exc_info
File "/usr/lib64/python3.6/site-packages/sqlalchemy/util/compat.py", line 203, in raise_from_cause
reraise(type(exception), exception, tb=exc_tb, cause=cause)
File "/usr/lib64/python3.6/site-packages/sqlalchemy/util/compat.py", line 186, in reraise
raise value.with_traceback(tb)
File "/usr/lib64/python3.6/site-packages/sqlalchemy/engine/base.py", line 1182, in _execute_context
context)
File "/usr/lib64/python3.6/site-packages/sqlalchemy/engine/default.py", line 470, in do_execute
cursor.execute(statement, parameters)
sqlalchemy.exc.DataError: (psycopg2.DataError) integer out of range
We have run out of IDs for the unapplied_change table. Since unapplied changes are temporary data that will be regenerated, we can just delete them all and reset the sequence. I'll do that in production:
delete from unapplied_change;
alter sequence unapplied_change_id_seq restart;
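To illustrate why this can happen under normal operation, here is a rough back-of-the-envelope sketch. It assumes the id column is a 32-bit PostgreSQL integer (which the "integer out of range" error suggests) and that every resolver run deletes and reinserts its rows, so each run consumes fresh IDs. The helper names and the rows-per-run figure are hypothetical, not taken from Koschei.

```python
# Upper bound of a PostgreSQL "integer" (int4) column.
INT4_MAX = 2**31 - 1


def ids_remaining(last_value: int, max_value: int = INT4_MAX) -> int:
    """Return how many more IDs fit in the column after last_value."""
    return max(max_value - last_value, 0)


def runs_until_exhaustion(last_value: int, rows_per_run: int,
                          max_value: int = INT4_MAX) -> int:
    """Return how many full regeneration runs fit before the column overflows.

    rows_per_run is a hypothetical average number of rows inserted per run.
    """
    if rows_per_run <= 0:
        raise ValueError("rows_per_run must be positive")
    return ids_remaining(last_value, max_value) // rows_per_run


if __name__ == "__main__":
    # Hypothetical workload: one million rows regenerated per run.
    print(runs_until_exhaustion(last_value=0, rows_per_run=1_000_000))
```

Because the table's contents are regenerated, the row count stays small while the sequence only ever moves forward, so restarting the sequence postpones the problem rather than removing it.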
To prevent this in the future, I could drop the useless primary key, but BDR doesn't permit tables without a primary key. Or (more complex, but likely better), I can change the unapplied_change table to be a relation table between package and dependency, similar to what I did for applied_change.
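A minimal sketch of the relation-table idea, using an in-memory SQLite database as a stand-in for PostgreSQL. The column names package_id and dependency_id are assumptions for illustration; the real table likely carries more columns. The point is only that a composite primary key needs no sequence, so repeated regeneration cannot exhaust anything.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory DB with a sequence-free unapplied_change table."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: the (package_id, dependency_id) pair is the key,
    # so no surrogate id column or sequence is involved.
    conn.execute(
        """
        CREATE TABLE unapplied_change (
            package_id    INTEGER NOT NULL,
            dependency_id INTEGER NOT NULL,
            PRIMARY KEY (package_id, dependency_id)
        )
        """
    )
    return conn


def regenerate(conn: sqlite3.Connection, changes) -> None:
    """Replace all unapplied changes in one transaction."""
    with conn:  # commits on success, rolls back on error
        conn.execute("DELETE FROM unapplied_change")
        conn.executemany(
            "INSERT INTO unapplied_change VALUES (?, ?)", changes
        )


conn = make_db()
# Regenerating the same data many times consumes no IDs.
for _ in range(3):
    regenerate(conn, [(1, 10), (1, 11), (2, 10)])

rows = conn.execute(
    "SELECT package_id, dependency_id FROM unapplied_change ORDER BY 1, 2"
).fetchall()
print(rows)
```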
Production fixed by the above.
This just happened again in Fedora production.
We are no longer using BDR. I will see if I can remove the primary key. Until then I will try to manually fix the production DB as described in https://github.com/msimacek/koschei/issues/234#issuecomment-365898386
This happened again a few days ago.
This happened again in production
This just happened again: https://pagure.io/fedora-infrastructure/issue/11636
This just happened again. ;)
And again. ;)
Latest Koschei has been deployed in production in Fedora infrastructure. This upstream fix should be live in Fedora now.