PostgreSQL errors

Open VilleS1 opened this issue 3 years ago • 3 comments

On two independent dCache installations, the errors below appear in the PostgreSQL logs. They are seen at least with dCache 6.2.44 and PostgreSQL 9.5, dCache 6.2.49 and PostgreSQL 10.22, and dCache 7.2.19 and PostgreSQL 12.12.

Looks like t_locationinfo_inumber_fkey was introduced in dCache 2.15.

2022-10-27 02:14:40.864 EEST [1859] ERROR: insert or update on table "t_locationinfo" violates foreign key constraint "t_locationinfo_inumber_fkey"
2022-10-27 02:14:40.864 EEST [1859] DETAIL: Key (inumber)=(53190770) is not present in table "t_inodes".
2022-10-27 02:14:40.864 EEST [1859] STATEMENT: INSERT INTO t_locationinfo (inumber,itype,ilocation,ipriority,ictime,iatime,istate) VALUES($1,$2,$3,$4,$5,$6,$7) ON CONFLICT ON CONSTRAINT t_locationinfo_pkey DO NOTHING

VilleS1 • Oct 27 '22 12:10

Hi,

dCache relies on database integrity to keep the namespace in a consistent state. Unfortunately, some application-level errors are logged by PostgreSQL as database errors. The log entry above says that there was an update of the file location information for a file that had already been removed. Because PostgreSQL's ON CONFLICT clause is triggered only by unique or exclusion constraint violations, not by foreign key violations, those nasty error messages are logged. The alternative is a join on insert, which carries a performance penalty.
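
To make the difference concrete, here is a minimal sketch using Python's `sqlite3` module as a stand-in for PostgreSQL. The two-column schema, the composite primary key, the pool name and the helper names (`make_db`, `run`, `location_count`) are assumptions made for illustration only; they are not the real dCache schema or code, and SQLite's behaviour merely resembles PostgreSQL's here.

```python
import sqlite3

def make_db():
    """In-memory SQLite stand-in with a reduced t_inodes / t_locationinfo schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
    conn.execute("CREATE TABLE t_inodes (inumber INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE t_locationinfo ("
        " inumber INTEGER NOT NULL REFERENCES t_inodes(inumber),"
        " ilocation TEXT NOT NULL,"
        " PRIMARY KEY (inumber, ilocation))"  # simplified key, an assumption
    )
    conn.execute("INSERT INTO t_inodes (inumber) VALUES (1)")
    return conn

# Same shape as the logged statement: ON CONFLICT only covers duplicate keys.
PLAIN_INSERT = (
    "INSERT INTO t_locationinfo (inumber, ilocation) VALUES (?, ?) "
    "ON CONFLICT DO NOTHING"
)

# "Join on insert": the row is taken from t_inodes, so a missing inode
# yields zero rows to insert instead of a foreign key error.
JOIN_INSERT = (
    "INSERT INTO t_locationinfo (inumber, ilocation) "
    "SELECT inumber, ? FROM t_inodes WHERE inumber = ? "
    "ON CONFLICT DO NOTHING"
)

def run(conn, sql, params):
    """Execute one insert and report whether the database raised an error."""
    try:
        conn.execute(sql, params)
        return "ok"
    except sqlite3.IntegrityError:
        return "fk violation"

def location_count(conn):
    """Number of rows currently stored in t_locationinfo."""
    return conn.execute("SELECT COUNT(*) FROM t_locationinfo").fetchone()[0]
```

With this setup, a duplicate `(1, "pool-a")` insert is skipped silently, an insert for the missing inode 53190770 raises a foreign key error with `PLAIN_INSERT`, and the same insert through `JOIN_INSERT` simply stores nothing. The join variant pays for that with an extra lookup in `t_inodes` on every insert.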

kofemann • Oct 27 '22 12:10

I don't know much about dCache internals, but why process files that do not exist at all? Why not first check whether the file exists and only then process it?

VilleS1 • Oct 27 '22 13:10

The issue is that in a concurrent environment the result of a check is no longer valid as soon as you have it, because a different request might delete or create that file in the meantime.


sequenceDiagram
    autonumber

    participant Client 1
    participant Namespace
    participant Client 2


    Client 1->>Namespace: File X exists?
    Namespace-->>Client 1: Yes
    Client 2->>Namespace: Delete File X.
    Namespace-->>Client 2: OK.
    Client 1->>Namespace: Delete File X.
    Namespace-->>Client 1: Error - file doesn't exist.

In the diagram above, the check at step 1 is no longer valid at step 5.
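
The same interleaving can be replayed deterministically in a few lines of Python. The `Namespace` class and the `replay` function are hypothetical toys written for this illustration; they are not dCache classes.

```python
class Namespace:
    """Toy in-memory namespace (hypothetical, for illustration only)."""

    def __init__(self, files):
        self._files = set(files)

    def exists(self, name):
        return name in self._files

    def delete(self, name):
        if name not in self._files:
            raise FileNotFoundError(name)
        self._files.remove(name)


def replay():
    """Replay the six messages of the diagram in order and record the replies."""
    ns = Namespace({"X"})
    log = []

    # Steps 1-2: client 1 checks and is told the file exists.
    log.append(("client 1: exists?", "Yes" if ns.exists("X") else "No"))

    # Steps 3-4: client 2 deletes the file before client 1 acts on its check.
    ns.delete("X")
    log.append(("client 2: delete", "OK"))

    # Steps 5-6: client 1 acts on the stale answer from step 2 and fails.
    try:
        ns.delete("X")
        log.append(("client 1: delete", "OK"))
    except FileNotFoundError:
        log.append(("client 1: delete", "Error - file doesn't exist"))
    return log
```

Calling `replay()` returns the three replies in order: "Yes", "OK", and then the error for client 1, even though its own check succeeded.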

Thus you have to lock the value (or, in our case, take an exclusive lock on a table). This has a negative impact on overall DB throughput. Our approach is to rely on DB consistency constraints and avoid locking as much as possible.
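
As a rough analogy in plain Python (not dCache code), the two strategies can be sketched as follows. The function names are made up for this example, and the single `set.remove` call stands in for the database enforcing its own constraint; the sketch assumes CPython, where that call is not interleaved with other threads.

```python
import threading

def delete_with_lock(files, lock, name):
    """Pessimistic: hold a lock so the check and the delete cannot be interleaved."""
    with lock:
        if name in files:
            files.remove(name)
            return True
        return False

def delete_optimistic(files, name):
    """Optimistic: no prior check; attempt the delete and treat failure as 'already gone'."""
    try:
        files.remove(name)  # one set operation, standing in for the DB constraint
        return True
    except KeyError:
        return False

def race(delete, workers=8):
    """Call `delete` from several threads and count how many calls succeeded."""
    results = []
    threads = [
        threading.Thread(target=lambda: results.append(delete()))
        for _ in range(workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(results)

def winners_with_lock():
    files, lock = {"X"}, threading.Lock()
    return race(lambda: delete_with_lock(files, lock, "X"))

def winners_optimistic():
    files = {"X"}
    return race(lambda: delete_optimistic(files, "X"))
```

In both variants exactly one of the competing callers succeeds. The difference is that the optimistic variant never makes callers wait on a shared lock; the losers get an error, which is the application-level counterpart of the messages in the PostgreSQL log.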

kofemann • Dec 05 '22 09:12