
Mover mixes expected and calculated checksums

Open paulmillar opened this issue 3 years ago • 1 comments

The mover / ReplicaRecord stores the expected checksum values for an incoming file in a FileAttributes object. The same FileAttributes object is then updated by adding any known checksum values. The result is then sent to PnfsManager.

This works provided that every known checksum is verified by comparing it with a checksum calculated from the received data using the same algorithm.

However, it is a bad design: if a known checksum is not verified, dCache nevertheless accepts it as valid. This could lead to dCache accepting corrupt data.

A better design would have a clear separation between expected checksums and actual checksums. Only the actual checksums are sent to PnfsManager. This ensures that the namespace has only checksums that describe the data stored on the pool.
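To illustrate the separation, here is a minimal sketch and not dCache code. The class `ChecksumSeparationSketch`, its methods, and `ChecksumMismatchException` are hypothetical names invented for this example; only `java.util.zip.Adler32` and `java.security.MessageDigest` are real JDK classes. The expected checksums are read-only input, and the returned map contains only values calculated from the received data.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.Adler32;

// Hypothetical sketch: none of these names are dCache classes.
final class ChecksumSeparationSketch {

    /** Thrown when a calculated checksum differs from the expected one. */
    static final class ChecksumMismatchException extends Exception {
        ChecksumMismatchException(String message) {
            super(message);
        }
    }

    /** Calculates the hex checksum of the received data for one algorithm. */
    static String calculate(String algorithm, byte[] data)
            throws NoSuchAlgorithmException {
        if (algorithm.equals("ADLER32")) {
            Adler32 adler = new Adler32();
            adler.update(data, 0, data.length);
            return String.format("%08x", adler.getValue());
        }
        // Any other name is passed to the JDK, e.g. "MD5" or "SHA-256".
        MessageDigest digest = MessageDigest.getInstance(algorithm);
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest(data)) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    /**
     * Verifies every expected checksum against the received data.
     * The expected map is never modified or returned; the result holds
     * only calculated values, which is what would go to the namespace.
     */
    static Map<String, String> verify(Map<String, String> expected, byte[] data)
            throws ChecksumMismatchException, NoSuchAlgorithmException {
        Map<String, String> actual = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : expected.entrySet()) {
            String calculated = calculate(entry.getKey(), data);
            if (!calculated.equalsIgnoreCase(entry.getValue())) {
                throw new ChecksumMismatchException(entry.getKey()
                        + ": expected " + entry.getValue()
                        + " but calculated " + calculated);
            }
            actual.put(entry.getKey(), calculated);
        }
        return actual;
    }
}
```

With this shape, an expected checksum can only reach the result by being recalculated from the data, so a skipped or failed verification cannot leave an unverified value behind.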

paulmillar, May 13 '22 18:05

Note that doors will push known checksums into the namespace. Therefore, this problem is more involved than simply fixing pools.

To create a clear separation between expected and actual checksums, the doors should be updated to send expected checksum values to the pool directly (e.g., via the ProtocolInfo object). The pool would then be able to keep these checksums separate from those representing the file (in the namespace).
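A rough sketch of that data flow follows. It is only an illustration under stated assumptions: `TransferInfo` stands in for a ProtocolInfo-like object and `NamespaceUpdate` for the message sent to the namespace; both are hypothetical types, not dCache's real classes. The pool receives the expected values from the door, compares them with what it calculated, and builds the namespace update from calculated values only.

```java
import java.util.Map;

// Hypothetical sketch: these types are illustrative stand-ins,
// not dCache's ProtocolInfo or PnfsManager message classes.
final class DoorToPoolSketch {

    /** Stand-in for a ProtocolInfo-like object the door sends to the pool. */
    static final class TransferInfo {
        private final Map<String, String> expectedChecksums;

        TransferInfo(Map<String, String> expectedChecksums) {
            this.expectedChecksums = Map.copyOf(expectedChecksums);
        }

        Map<String, String> expectedChecksums() {
            return expectedChecksums;
        }
    }

    /** Stand-in for the update the pool sends to the namespace. */
    static final class NamespaceUpdate {
        private final Map<String, String> checksums;

        NamespaceUpdate(Map<String, String> checksums) {
            this.checksums = Map.copyOf(checksums);
        }

        Map<String, String> checksums() {
            return checksums;
        }
    }

    /**
     * Pool side: expected values are used for comparison only.
     * An expected checksum with no calculated counterpart is an error,
     * not something to accept silently.
     */
    static NamespaceUpdate completeUpload(TransferInfo info,
                                          Map<String, String> calculated) {
        for (Map.Entry<String, String> e : info.expectedChecksums().entrySet()) {
            String actual = calculated.get(e.getKey());
            if (actual == null) {
                throw new IllegalStateException(
                        e.getKey() + " checksum was expected but never calculated");
            }
            if (!actual.equalsIgnoreCase(e.getValue())) {
                throw new IllegalStateException(
                        e.getKey() + " checksum mismatch");
            }
        }
        // Only values calculated from the stored data reach the namespace.
        return new NamespaceUpdate(calculated);
    }
}
```

The point of the sketch is that an expected checksum has no path into `NamespaceUpdate` except by matching a calculated value.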

paulmillar, Jul 11 '22 12:07