PostgreSQL errors
On two independent dCache installations, the following errors appear in the PostgreSQL logs. They have been seen at least with dCache 6.2.44 and PostgreSQL 9.5, dCache 6.2.49 and PostgreSQL 10.22, and dCache 7.2.19 and PostgreSQL 12.12.
It looks like t_locationinfo_inumber_fkey was introduced in dCache 2.15.
2022-10-27 02:14:40.864 EEST [1859] ERROR: insert or update on table "t_locationinfo" violates foreign key constraint "t_locationinfo_inumber_fkey"
2022-10-27 02:14:40.864 EEST [1859] DETAIL: Key (inumber)=(53190770) is not present in table "t_inodes".
2022-10-27 02:14:40.864 EEST [1859] STATEMENT: INSERT INTO t_locationinfo (inumber,itype,ilocation,ipriority,ictime,iatime,istate) VALUES($1,$2,$3,$4,$5,$6,$7) ON CONFLICT ON CONSTRAINT t_locationinfo_pkey DO NOTHING
Hi,
dCache relies on database integrity to keep the namespace in a consistent state. Unfortunately, some application-level errors are logged by PostgreSQL at ERROR level. The log entry above says that there was an update of the location information for a file that had already been removed. Because PostgreSQL's ON CONFLICT clause handles only uniqueness violations and not foreign key violations, these nasty error messages are logged. The alternative is a join on insert, which carries a performance penalty.
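To make the difference between the two kinds of violation concrete, here is a minimal sketch in Python. It uses the standard-library sqlite3 module as a stand-in for PostgreSQL, and a heavily simplified schema: the two-column t_locationinfo table, its (inumber, ilocation) primary key, and the helper names open_namespace and add_location are all assumptions made for illustration, not dCache's actual schema or code. It only shows the general pattern: ON CONFLICT ... DO NOTHING silently skips a duplicate row, while an insert for a missing inode still fails the foreign key check and has to be handled by the application.

```python
import sqlite3


def open_namespace():
    """Create an in-memory database with a simplified, hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this pragma is enabled.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE t_inodes (inumber INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE t_locationinfo ("
        " inumber INTEGER NOT NULL REFERENCES t_inodes(inumber),"
        " ilocation TEXT NOT NULL,"
        " PRIMARY KEY (inumber, ilocation))"
    )
    return conn


def add_location(conn, inumber, location):
    """Record a location for a file.

    Returns True if the location is recorded (newly or already present),
    and False if the inode no longer exists.
    """
    try:
        with conn:  # commits on success, rolls back on an exception
            conn.execute(
                "INSERT INTO t_locationinfo (inumber, ilocation) VALUES (?, ?) "
                "ON CONFLICT (inumber, ilocation) DO NOTHING",
                (inumber, location),
            )
        return True
    except sqlite3.IntegrityError:
        # The foreign key check failed: the file was removed in the meantime.
        # ON CONFLICT does not cover this case, so the database reports an error.
        return False
```

In this sketch, calling add_location twice with the same existing inode and location succeeds both times, while calling it with an inumber that is not in t_inodes returns False. The database still reports the second case as an error, which is the kind of entry that ends up in the server log.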
I don't know much about dCache internals, but why process files that do not exist at all? Why not check first whether the file exists and only then process it?
The issue is that in a concurrent environment, the result of a check is no longer valid as soon as you have it, because a different request might delete or create the file in the meantime.
sequenceDiagram
autonumber
participant Client 1
participant Namespace
participant Client 2
Client 1->>Namespace: File X exists?
Namespace-->>Client 1: Yes
Client 2->>Namespace: Delete File X.
Namespace-->>Client 2: OK.
Client 1->>Namespace: Delete File X.
Namespace-->>Client 1: Error - file doesn't exist.
In the diagram above, the check at step 1 is no longer valid at step 5.
Thus you have to lock the value (or, in our case, take an exclusive lock on a table), which has a negative impact on overall database throughput. Our approach is to rely on database consistency constraints and avoid locking as much as possible.
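The interleaving from the diagram can be replayed step by step. The following is only a sketch: it uses Python's sqlite3 module with a single connection and runs the steps sequentially, so it imitates the ordering of the two clients rather than real concurrency. The one-column t_inodes table and the helper names file_exists, delete_file, and replay_diagram are hypothetical and exist only for this illustration.

```python
import sqlite3


def file_exists(conn, inumber):
    """Check step: is the inode present right now?"""
    row = conn.execute(
        "SELECT 1 FROM t_inodes WHERE inumber = ?", (inumber,)
    ).fetchone()
    return row is not None


def delete_file(conn, inumber):
    """Act step: try to delete and report whether a row was actually removed."""
    with conn:
        cur = conn.execute("DELETE FROM t_inodes WHERE inumber = ?", (inumber,))
    return cur.rowcount == 1


def replay_diagram():
    """Replay the diagram's message order with both clients on one connection."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t_inodes (inumber INTEGER PRIMARY KEY)")
    with conn:
        conn.execute("INSERT INTO t_inodes (inumber) VALUES (1)")  # file X

    steps = []
    steps.append(("client1 check", file_exists(conn, 1)))   # steps 1-2: "Yes"
    steps.append(("client2 delete", delete_file(conn, 1)))  # steps 3-4: "OK"
    steps.append(("client1 delete", delete_file(conn, 1)))  # steps 5-6: fails
    return steps
```

Client 1's check returns True, yet its delete finds nothing to remove because Client 2 got there first. The outcome of the operation itself (here, the row count of the DELETE) is the only answer that is still valid when it is used, which is why acting and handling the failure avoids the need for a lock around check and act.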