dcache
WebDAV failed uploads leave 0-length files in metadata
Hi,
I found many 0-length files in the namespace that correspond to failed uploads (3500 files for Atlas and CMS, 500 for LHCb). The files are correctly deleted from the pools but not from the namespace:
select * from t_inodes where itype=32768 and isize=0 and icrtime < CURRENT_TIMESTAMP-INTERVAL '24 hours' order by icrtime desc;
ipnfsid | itype | imode | inlink | iuid | igid | isize | iio | ictime | iatime | imtime | icrtime | igeneration | iaccess_latency | iretention_policy | inumber | iqos_policy | iqos_state
--------------------------------------+-------+-------+--------+------+------+-------+-----+----------------------------+----------------------------+----------------------------+----------------------------+-------------+-----------------+-------------------+------------+-------------+------------
0000179591B66B7245B5A7737413750F4F47 | 32768 | 420 | 1 | 3033 | 119 | 0 | 2 | 2024-12-19 06:51:27.989+01 | 2024-12-19 06:51:27.989+01 | 2024-12-19 06:51:27.989+01 | 2024-12-19 06:51:27.989+01 | 0 | | | 1288503194 | |
00005DA951F022A3419CAE8F2BE86FFC31B8 | 32768 | 420 | 1 | 3033 | 119 | 0 | 2 | 2024-12-19 06:46:03.511+01 | 2024-12-19 06:46:03.511+01 | 2024-12-19 06:46:03.511+01 | 2024-12-19 06:46:03.511+01 | 0 | | | 1288499582 | |
0000843FC0B40E324342889C8B334B9C53F8 | 32768 | 420 | 1 | 3033 | 119 | 0 | 2 | 2024-12-19 06:45:47.725+01 | 2024-12-19 06:45:47.725+01 | 2024-12-19 06:45:47.725+01 | 2024-12-19 06:45:47.725+01 | 0 | | | 1288499339 | |
0000C648F112A32E4C9EB0F24E695181872D | 32768 | 420 | 1 | 3327 | 124 | 0 | 2 | 2024-12-18 14:21:26.529+01 | 2024-12-18 14:21:26.529+01 | 2024-12-18 14:21:26.529+01 | 2024-12-18 14:21:26.529+01 | 0 | | | 1288099252 | |
Nothing is left in t_location_trash, so the cleaner is working correctly.
Here are the failed transfers from the pool logs:
Dec 19 06:51:28 ccdcatli344 dcache@ccdcatli344-pool-cms-hpssdata-li344a-Domain[65368]: 19 Dec 2024 06:51:28 (pool-cms-hpssdata-li344a) [door:webdav-ccdcatli345@webdav-ccdcatli345Domain:AAYpmR8NFLA webdav-ccdcatli345 PoolAcceptFile 0000179591B66B7245B5A7737413750F4F47] Transfer failed: Connection lost before end of file.
Dec 19 06:51:28 ccdcatli344 dcache@ccdcatli344-pool-cms-hpssdata-li344a-Domain[65368]: 19 Dec 2024 06:51:28 (pool-cms-hpssdata-li344a) [door:webdav-ccdcatli345@webdav-ccdcatli345Domain:AAYpmR8NFLA webdav-ccdcatli345 PoolAcceptFile 0000179591B66B7245B5A7737413750F4F47] Transfer failed in post-processing: File size mismatch (expected=497992534, actual=0)
Dec 19 06:51:28 ccdcatli344 dcache@ccdcatli344-pool-cms-hpssdata-li344a-Domain[65368]: 19 Dec 2024 06:51:28 (pool-cms-hpssdata-li344a) [door:webdav-ccdcatli345@webdav-ccdcatli345Domain:AAYpmR8NFLA webdav-ccdcatli345 PoolAcceptFile 0000179591B66B7245B5A7737413750F4F47] Failed to read file size: java.nio.file.NoSuchFileException: /data/pool-cms-hpssdata-li344a/pool/data/0000179591B66B7245B5A7737413750F4F47
Dec 19 06:46:06 ccdcatli415 dcache@ccdcatli415-pool-cms-hpssdata-li415a-Domain[104743]: 19 Dec 2024 06:46:06 (pool-cms-hpssdata-li415a) [door:webdav-ccdcatli345@webdav-ccdcatli345Domain:AAYpmQu1/Dg webdav-ccdcatli345 PoolAcceptFile 00005DA951F022A3419CAE8F2BE86FFC31B8] Transfer failed: Connection lost before end of file.
Dec 19 06:46:06 ccdcatli415 dcache@ccdcatli415-pool-cms-hpssdata-li415a-Domain[104743]: 19 Dec 2024 06:46:06 (pool-cms-hpssdata-li415a) [door:webdav-ccdcatli345@webdav-ccdcatli345Domain:AAYpmQu1/Dg webdav-ccdcatli345 PoolAcceptFile 00005DA951F022A3419CAE8F2BE86FFC31B8] Transfer failed in post-processing: File size mismatch (expected=424527680, actual=0)
Dec 19 06:46:06 ccdcatli415 dcache@ccdcatli415-pool-cms-hpssdata-li415a-Domain[104743]: 19 Dec 2024 06:46:06 (pool-cms-hpssdata-li415a) [door:webdav-ccdcatli345@webdav-ccdcatli345Domain:AAYpmQu1/Dg webdav-ccdcatli345 PoolAcceptFile 00005DA951F022A3419CAE8F2BE86FFC31B8] Failed to read file size: java.nio.file.NoSuchFileException: /data/pool-cms-hpssdata-li415a/pool/data/00005DA951F022A3419CAE8F2BE86FFC31B8
Dec 18 14:30:47 ccdcatli416 dcache@ccdcatli416-pool-atlas-dq2-li416a-Domain[100801]: 18 Dec 2024 14:30:47 (pool-atlas-dq2-li416a) [door:webdav-ccdcatli367@webdav-ccdcatli367Domain:AAYpi0py7lg webdav-ccdcatli367 PoolAcceptFile 0000C648F112A32E4C9EB0F24E695181872D] Transfer failed: No connection from client after 300 seconds. Giving up.
Dec 18 14:30:47 ccdcatli416 dcache@ccdcatli416-pool-atlas-dq2-li416a-Domain[100801]: 18 Dec 2024 14:30:47 (pool-atlas-dq2-li416a) [door:webdav-ccdcatli367@webdav-ccdcatli367Domain:AAYpi0py7lg webdav-ccdcatli367 PoolAcceptFile 0000C648F112A32E4C9EB0F24E695181872D] Transfer failed in post-processing: File size mismatch (expected=692252, actual=0)
Dec 18 14:30:47 ccdcatli416 dcache@ccdcatli416-pool-atlas-dq2-li416a-Domain[100801]: 18 Dec 2024 14:30:47 (pool-atlas-dq2-li416a) [door:webdav-ccdcatli367@webdav-ccdcatli367Domain:AAYpi0py7lg webdav-ccdcatli367 PoolAcceptFile 0000C648F112A32E4C9EB0F24E695181872D] Failed to read file size: java.nio.file.NoSuchFileException: /data/pool-atlas-dq2-li416a/pool/data/0000C648F112A32E4C9EB0F24E695181872D
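To cross-check the pool logs against the rows returned from t_inodes, the PNFS IDs of the failed uploads can be extracted with a small script. This is only a minimal sketch: it assumes the log line layout shown in the excerpts above (a 36-character hexadecimal PNFS ID after `PoolAcceptFile`, followed by `]` and the message). The function name `failed_uploads` and the regular expression are my own and are not part of dCache.

```python
import re
from collections import defaultdict

# Assumed layout, taken from the pool log excerpts above:
#   ... PoolAcceptFile <36 hex digit PNFS ID>] <message>
LINE_RE = re.compile(
    r"PoolAcceptFile (?P<pnfsid>[0-9A-F]{36})\] (?P<message>.*)$"
)


def failed_uploads(lines):
    """Map each PNFS ID to the failure messages logged for it.

    Hypothetical helper: a line counts as a failure when its message
    contains the word "failed" (case-insensitive), which matches the
    three message types seen in the logs above.
    """
    failures = defaultdict(list)
    for line in lines:
        match = LINE_RE.search(line.strip())
        if match and "failed" in match.group("message").lower():
            failures[match.group("pnfsid")].append(match.group("message"))
    return dict(failures)
```

Calling `failed_uploads(open(path))` on a pool log file (with `path` being whatever log file is inspected) gives a dictionary keyed by PNFS ID; those keys can then be compared with the `ipnfsid` column of the query result.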