warehouse
warehouse copied to clipboard
Project.last_serial failed to update.
Reported by a bandersnatch user in #12214, the last_serial value for a Project was out of sync with the journals table after what appears to have been an automated mass removal of releases.
warehouse=> select id, name, version, action, submitted_date from journals where name='jarvis-ui' order by id desc limit 10;
id | name | version | action | submitted_date
----------+-----------+---------+----------------+----------------------------
16626043 | jarvis-ui | 0.3.3 | remove release | 2023-01-30 23:30:11.147139
16626042 | jarvis-ui | 0.3.8 | remove release | 2023-01-30 23:30:10.755916
16626041 | jarvis-ui | 0.4.0 | remove release | 2023-01-30 23:30:10.732833
16626040 | jarvis-ui | 0.3.5 | remove release | 2023-01-30 23:30:10.69155
16626039 | jarvis-ui | 0.3.1 | remove release | 2023-01-30 23:30:10.640855
16626038 | jarvis-ui | 0.3.2 | remove release | 2023-01-30 23:30:10.541308
16626037 | jarvis-ui | 0.2.5 | remove release | 2023-01-30 23:30:10.50915
16626036 | jarvis-ui | 0.3.0 | remove release | 2023-01-30 23:30:10.527536
16626035 | jarvis-ui | 0.4.8 | remove release | 2023-01-30 23:30:10.537667
16626034 | jarvis-ui | 0.4.3 | remove release | 2023-01-30 23:30:10.518038
(10 rows)
warehouse=> select last_serial from projects where name='jarvis-ui';
last_serial
-------------
16626041
(1 row)
Note that the last_serial column on the project record is not actually the latest serial. Seems the trigger that calls maintain_project_last_serial() for the journals table failed to update the project. I'll file a separate issue for that.
It has been corrected by calling a no-op update on the row for now.
warehouse=> begin;
BEGIN
warehouse=*> update journals set name=name where id=16626043;
UPDATE 1
warehouse=*> select * from journals where name='jarvis-ui' order by id desc limit 10;
id | name | version | action | submitted_date | submitted_by | submitted_from
----------+-----------+---------+----------------+----------------------------+---------------+-----------------------------------------
16626043 | jarvis-ui | 0.3.3 | remove release | 2023-01-30 23:30:11.147139 | thevickypedia | 2600:1702:1dc0:7820:6937:a99d:417e:de82
16626042 | jarvis-ui | 0.3.8 | remove release | 2023-01-30 23:30:10.755916 | thevickypedia | 2600:1702:1dc0:7820:6937:a99d:417e:de82
16626041 | jarvis-ui | 0.4.0 | remove release | 2023-01-30 23:30:10.732833 | thevickypedia | 2600:1702:1dc0:7820:6937:a99d:417e:de82
16626040 | jarvis-ui | 0.3.5 | remove release | 2023-01-30 23:30:10.69155 | thevickypedia | 2600:1702:1dc0:7820:6937:a99d:417e:de82
16626039 | jarvis-ui | 0.3.1 | remove release | 2023-01-30 23:30:10.640855 | thevickypedia | 2600:1702:1dc0:7820:6937:a99d:417e:de82
16626038 | jarvis-ui | 0.3.2 | remove release | 2023-01-30 23:30:10.541308 | thevickypedia | 2600:1702:1dc0:7820:6937:a99d:417e:de82
16626037 | jarvis-ui | 0.2.5 | remove release | 2023-01-30 23:30:10.50915 | thevickypedia | 2600:1702:1dc0:7820:6937:a99d:417e:de82
16626036 | jarvis-ui | 0.3.0 | remove release | 2023-01-30 23:30:10.527536 | thevickypedia | 2600:1702:1dc0:7820:6937:a99d:417e:de82
16626035 | jarvis-ui | 0.4.8 | remove release | 2023-01-30 23:30:10.537667 | thevickypedia | 2600:1702:1dc0:7820:6937:a99d:417e:de82
16626034 | jarvis-ui | 0.4.3 | remove release | 2023-01-30 23:30:10.518038 | thevickypedia | 2600:1702:1dc0:7820:6937:a99d:417e:de82
(10 rows)
warehouse=*> select last_serial from projects where name='jarvis-ui';
last_serial
-------------
16626043
(1 row)
warehouse=*> commit;
COMMIT
Originally posted by @ewdurbin in https://github.com/pypi/warehouse/issues/12214#issuecomment-1416771567
I'm trying to think how this could have happened, and I think I have a theory for it.
Multiple transactions occurring in isolation, READ COMMITTED
- Transaction A inserts record X into journals
- Transaction B inserts record Y into journals
- Transaction B updates
Project.last_serial, locking it. - Transaction A gets blocked on Transaction B.
- Transaction B commits, updating
Project.last_serialto Y. - Transaction A updates
Project.last_serial, locking it. - Transaction A commits, updating
Project.last_serialto X.
If that's right, I suspect we can fix it by always issuing a SELECT FOR UPDATE on the projects table immediately in any view that is going to create journals. That should effectively serialize any transactions that operate on the same project.
Another instance:
warehouse=> select id, name, version, action, submitted_date from journals where name='expose-localhost' order by id desc limit 10;
id | name | version | action | submitted_date
----------+------------------+---------+----------------+----------------------------
17357317 | expose-localhost | 0.4.2 | remove release | 2023-03-20 00:20:18.457385
17357316 | expose-localhost | 0.3.0 | remove release | 2023-03-20 00:20:18.45873
17357315 | expose-localhost | 0.3.1 | remove release | 2023-03-20 00:20:18.461733
17357314 | expose-localhost | 0.2.9 | remove release | 2023-03-20 00:20:18.431603
17357313 | expose-localhost | 0.1.8 | remove release | 2023-03-20 00:20:18.423596
17357312 | expose-localhost | 0.2.4 | remove release | 2023-03-20 00:20:18.367459
17357311 | expose-localhost | 0.3.4 | remove release | 2023-03-20 00:20:18.360073
17357310 | expose-localhost | 0.3.6 | remove release | 2023-03-20 00:20:18.373008
17357309 | expose-localhost | 0.3.5 | remove release | 2023-03-20 00:20:18.271077
17357308 | expose-localhost | 0.2.0 | remove release | 2023-03-20 00:20:17.959461
(10 rows)
warehouse=> select last_serial from projects where name='expose-localhost';
last_serial
-------------
17357314
(1 row)
Manually resolved.
Two packages, cyaxares and later openmldb-tool has been failing for days.
I wonder if it's another instance of this issue, and if there's someone going to fix it manually.
Two packages,
cyaxaresand lateropenmldb-toolhas been failing for days.I wonder if it's another instance of this issue, and if there's someone going to fix it manually.
Could anybody look into this? Maybe @ewdurbin ?
Another instance:
warehouse=> select id, name, version, action, submitted_date from journals where name='cyaxares' order by id desc limit 20;
id | name | version | action | submitted_date
----------+----------+---------+-------------------------------------+----------------------------
18070374 | cyaxares | 0.2 | remove release | 2023-05-11 22:13:36.845118
18070372 | cyaxares | 0.2 | remove release | 2023-05-11 22:13:36.567462
18070371 | cyaxares | 0.2 | remove release | 2023-05-11 22:13:36.386991
18070370 | cyaxares | 0.2 | remove release | 2023-05-11 22:13:36.304009
18070369 | cyaxares | 0.2 | remove release | 2023-05-11 22:13:36.118203
18070368 | cyaxares | 0.2 | remove release | 2023-05-11 22:13:35.790249
18070365 | cyaxares | 0.2 | remove release | 2023-05-11 22:13:27.775461
18070364 | cyaxares | 0.2 | remove release | 2023-05-11 22:13:27.586464
18070363 | cyaxares | 0.2 | remove release | 2023-05-11 22:13:27.414776
18070362 | cyaxares | 0.2 | remove release | 2023-05-11 22:13:27.24229
18070361 | cyaxares | 0.2 | remove release | 2023-05-11 22:13:27.073759
18070360 | cyaxares | 0.2 | remove release | 2023-05-11 22:13:26.903427
18070358 | cyaxares | 0.2 | remove release | 2023-05-11 22:13:26.06091
18070356 | cyaxares | 0.2 | remove release | 2023-05-11 22:13:24.592312
18070353 | cyaxares | 0.2 | remove release | 2023-05-11 22:13:15.375401
18070352 | cyaxares | 0.2 | remove release | 2023-05-11 22:13:15.071967
18070351 | cyaxares | 0.2 | remove release | 2023-05-11 22:13:13.089239
18070348 | cyaxares | 0.2 | remove release | 2023-05-11 22:13:09.134356
18070270 | cyaxares | 0.2 | remove file cyaxares-0.2.tar.gz | 2023-05-11 22:08:59.822525
18070249 | cyaxares | 0.2 | add source file cyaxares-0.2.tar.gz | 2023-05-11 22:07:59.015508
(20 rows)
warehouse=> select last_serial from projects where name = 'cyaxares';
last_serial
-------------
18070352
(1 row)
Resolved:
warehouse=> begin;
BEGIN
warehouse=*> update journals set name=name where id=18070374;
UPDATE 1
warehouse=*> select id, name, version, action, submitted_date from journals where name='cyaxares' order by id desc limit 10;
id | name | version | action | submitted_date
----------+----------+---------+----------------+----------------------------
18070374 | cyaxares | 0.2 | remove release | 2023-05-11 22:13:36.845118
18070372 | cyaxares | 0.2 | remove release | 2023-05-11 22:13:36.567462
18070371 | cyaxares | 0.2 | remove release | 2023-05-11 22:13:36.386991
18070370 | cyaxares | 0.2 | remove release | 2023-05-11 22:13:36.304009
18070369 | cyaxares | 0.2 | remove release | 2023-05-11 22:13:36.118203
18070368 | cyaxares | 0.2 | remove release | 2023-05-11 22:13:35.790249
18070365 | cyaxares | 0.2 | remove release | 2023-05-11 22:13:27.775461
18070364 | cyaxares | 0.2 | remove release | 2023-05-11 22:13:27.586464
18070363 | cyaxares | 0.2 | remove release | 2023-05-11 22:13:27.414776
18070362 | cyaxares | 0.2 | remove release | 2023-05-11 22:13:27.24229
(10 rows)
warehouse=*> select last_serial from projects where name = 'cyaxares';
last_serial
-------------
18070374
(1 row)
warehouse=*> commit;
COMMIT
Another instance:
warehouse=> select id, name, version, action, submitted_date from journals where name='openmldb-tool' order by id desc limit 10;
id | name | version | action | submitted_date
----------+---------------+---------+---------------------------------------------------+----------------------------
18075139 | openmldb-tool | 0.8.0 | remove release | 2023-05-12 08:10:13.31279
18075138 | openmldb-tool | 0.8.0 | remove release | 2023-05-12 08:10:13.088052
18075137 | openmldb-tool | 0.8.0 | remove release | 2023-05-12 08:10:12.849749
18075135 | openmldb-tool | 0.8.0 | remove release | 2023-05-12 08:10:09.238179
18075134 | openmldb-tool | 0.8.0 | remove release | 2023-05-12 08:10:09.200528
18075133 | openmldb-tool | 0.8.0 | remove release | 2023-05-12 08:10:08.340646
18075132 | openmldb-tool | 0.8.0 | remove release | 2023-05-12 08:10:07.466855
18075131 | openmldb-tool | 0.8.0 | remove release | 2023-05-12 08:10:05.803824
18075126 | openmldb-tool | 0.8.0 | remove release | 2023-05-12 08:09:59.441708
18072976 | openmldb-tool | 0.8.0 | add py3 file openmldb_tool-0.8.0-py3-none-any.whl | 2023-05-12 03:44:07.769301
(10 rows)
warehouse=> select last_serial from projects where name = 'openmldb-tool';
last_serial
-------------
18075137
(1 row)
Resolved:
warehouse=> begin;
BEGIN
warehouse=*> update journals set name=name where id=18075139;
UPDATE 1
warehouse=*> select last_serial from projects where name = 'openmldb-tool';
last_serial
-------------
18075139
(1 row)
warehouse=*> select id, name, version, action, submitted_date from journals where name='openmldb-tool' order by id desc limit 10;
id | name | version | action | submitted_date
----------+---------------+---------+---------------------------------------------------+----------------------------
18075139 | openmldb-tool | 0.8.0 | remove release | 2023-05-12 08:10:13.31279
18075138 | openmldb-tool | 0.8.0 | remove release | 2023-05-12 08:10:13.088052
18075137 | openmldb-tool | 0.8.0 | remove release | 2023-05-12 08:10:12.849749
18075135 | openmldb-tool | 0.8.0 | remove release | 2023-05-12 08:10:09.238179
18075134 | openmldb-tool | 0.8.0 | remove release | 2023-05-12 08:10:09.200528
18075133 | openmldb-tool | 0.8.0 | remove release | 2023-05-12 08:10:08.340646
18075132 | openmldb-tool | 0.8.0 | remove release | 2023-05-12 08:10:07.466855
18075131 | openmldb-tool | 0.8.0 | remove release | 2023-05-12 08:10:05.803824
18075126 | openmldb-tool | 0.8.0 | remove release | 2023-05-12 08:09:59.441708
18072976 | openmldb-tool | 0.8.0 | add py3 file openmldb_tool-0.8.0-py3-none-any.whl | 2023-05-12 03:44:07.769301
(10 rows)
warehouse=*> commit;
COMMIT
I think that https://github.com/pypi/warehouse/pull/13936 might fix this.
Maybe another instance on fiftyone-desktop. Waiting for manual intervention.
2023-07-05 23:53:10,292 ERROR: Stale serial for fiftyone-desktop (18769749) not updating. Giving up. (package.py:82)
2023-07-05 23:53:10,292 ERROR: Error syncing package: fiftyone-desktop@18769749 (mirror.py:377)
bandersnatch.master.StalePage: Expected PyPI serial 18769749 for request https://pypi.org/pypi/fiftyone-desktop/json but got 18769748.
@TechCiel Manually resolved that for you. In this instance we had two journals for the same project get issued within a ~2 seconds of each other:
id | name | version | action | submitted_date
----------+------------------+---------+-------------------------------------------------------------------------+----------------------------
18769749 | fiftyone-desktop | 0.28.1 | add py3 file fiftyone_desktop-0.28.1-py3-none-macosx_11_0_arm64.whl | 2023-07-05 15:44:55.417118
18769748 | fiftyone-desktop | 0.28.1 | remove file fiftyone_desktop-0.28.1-py3-none-macosx_11_10_arm64.whl | 2023-07-05 15:44:57.439115
Manually resolved that for you.
Thank you @di , really appreciate that quick respond, but the problem is not resolved. Maybe manual update didn't purge the CDN cache?
> curl 'https://pypi.org/pypi/fiftyone-desktop/json' -I | grep x-pypi-last-serial
x-pypi-last-serial: 18769748
should be 18769749.
@TechCiel Good catch. I've issued a purge and this should now be resolved.
Another package mpf is failing for 40+ hours, and I see no sign of self-recovery
2023-11-05 04:29:24,004 ERROR: Stale serial for package mpf - Attempt 3 (package.py:76)
2023-11-05 04:29:24,005 ERROR: Stale serial for mpf (20485085) not updating. Giving up. (package.py:82)
> curl 'https://pypi.org/pypi/mpf/json' -I | grep x-pypi-last-serial
x-pypi-last-serial: 20485084
Another package
mpfis failing for 40+ hours, and I see no sign of self-recovery
Still.
The mpf issue has been unintentionally resolved with an extra update of package.
Sadly, we now have another issue:
2024-02-05 20:19:26,441 ERROR: Stale serial for proteus-cli (21741396) not updating. Giving up. (package.py:82)
2024-04-16 10:36:39,262 ERROR: Stale serial for tccli (22781647) not updating. Giving up. (package.py:82)
New failure with the proteus-cli one still not resolved.
Is the way to go about this to always just manually come here and request a purge? There is nothing on our client end that can assist with these issues?
bandersnatch.master.StalePage: Expected PyPI serial 14647759 for request https://pypi.org/pypi/pytranslation/json but got 13365137. We can no longer issue a PURGE. Report issue to PyPA Warehouse GitHub if it persists ...
With https://github.com/pypi/warehouse/pull/13936 merged now (along with a few follow up PRs to fix some deadlocks), I think that the primary cause of this has been fixed now.
The tl;dr is that our mirroring relied on the serial to be a monotonically increasing integer, but due to the way PostgreSQL works, concurrent transactions could end up with serials being "out of order", and https://github.com/pypi/warehouse/pull/13936 changes that so that transactions that generate new serial numbers are serialized behind what is effectively a mutex.
I'm going to close this now, but if anyone sees any new reports of this happening after today, we can re-open this issue.
2024-09-24 10:36:00,003 ERROR: Error syncing package: github@25164033
(mirror.py:377)
bandersnatch.master.StalePage: Expected PyPI serial 25164033 for requ
est https://pypi.org/pypi/github/json but got 25138863.
Likely but not sure, another instance of this problem, still ongoing. Could someone look into the database?
cc @dstufft
That's a bug for sure caused by a recent merge that causes journal to proceed but not purge for a project.
Could you help with purging the package while fixing the cause?
Purges are being issued shortly.
Should be cleared now. Thanks for the vigilance on this issue @TechCiel, let us know if any more crop up.
Should be cleared now.
It now shows x-pypi-last-serial: 14353030 ? 🤔 Should it be monotically increasing?
Thanks for the vigilance
That's because I maintain a PyPI mirror, and bandersnatch will fail the whole process in such cases.
Should be cleared now.
It now shows
x-pypi-last-serial: 14353030? 🤔 Should it be monotically increasing?
14353030 is the last valid journal entry for the github project. The bug that was introduced associated an unrelated change with that project.
14353030 is the last valid journal entry for the
githubproject. The bug that was introduced associated an unrelated change with that project.
I think increasing serial is kind of API promise of pypi.org, and bandersnatch is still expecting for 25164033. The XMLRPC changelog API could still be returning this serial.
bandersnatch.master.StalePage: Expected PyPI serial 25483353 for requ
est https://pypi.org/pypi/github/json but got 14353030.
The github package got bumped to a new serial and is failing again.