Dmitry Litvintsev
Further investigation tells me that PoolManager was restarted on Oct 29th. This could explain what happened.
The scope could be massive. Say you have 100K restore requests queued on the pools. You restart PoolManager and users start pre-staging the same files. If I understand correctly this...
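A rough sketch of how one could gauge that scope, assuming the concern is files that end up with more than one restore record. The `(pnfsid, action)` tuples, the sample IDs, and the function name `duplicate_restores` are all hypothetical; this is not the billing query from this thread, only an illustration of the counting idea.

```python
from collections import Counter


def duplicate_restores(records):
    """Return {pnfsid: count} for files with more than one 'restore' record.

    `records` is an iterable of (pnfsid, action) pairs; the shape is an
    assumption made for this sketch, not the real billing schema.
    """
    counts = Counter(pnfsid for pnfsid, action in records if action == "restore")
    return {pnfsid: n for pnfsid, n in counts.items() if n > 1}


# Hypothetical sample rows, invented for illustration only.
records = [
    ("0000A1", "restore"),
    ("0000A1", "restore"),
    ("0000B2", "restore"),
    ("0000C3", "store"),
    ("0000A1", "restore"),
]

dups = duplicate_restores(records)
# Every restore beyond the first one for a file counts as redundant work.
extra = sum(n - 1 for n in dups.values())
print(dups, extra)
```

Summing `n - 1` over the duplicated files gives the number of restores that would have been avoidable under this assumption.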
Correct me if I am mistaken here. This is my estimate of the effect of the PoolManager restart: ``` billing=# with foo as (select count(*) as counter, pnfsid from storageinfo where action='restore'...
Yes, they do correlate with PoolManager restarts. They come in groups (user agents pre-staging data by dataset).
``` [litvinse@litvintsev dcache]$ git bisect good 05524104505f85288ef0dc3d86a383600a54af34 is the first bad commit commit 05524104505f85288ef0dc3d86a383600a54af34 Author: Marina Sahakyan Date: Thu Apr 14 18:47:34 2022 +0200 dcahce: set https redirect to true...
Maybe this is my issue: I did not upgrade dCache on all the pools as well?
I migrated the file to another pool, same story. And yes, I can use other protocols: ``` [root@mu2ebuild01 litvinse]# dccp dcap://fndca1:24136/pnfs/fnal.gov/usr/mu2e/persistent/users/mu2epro/valjob/reco_031021/dig.brownd.CeEndpointMixTriggered.MDC2020k.001210_00000000.art . 2064392593 bytes (1.92 GiB) in 27 seconds (72.9 MiB/s)...
I do not know the exact options passed to globus-url-copy in the quoted case. It is likely mode E, so data is not proxied by the door. Coincidentally I am using TPC to...
I attach here the door access log and billing log for one of the transfers. The client logic is: ``` rc = 1 tries = 0 while rc != 0 and...
The source file has normal size:
```
[root@dcatest01 ~]# ls -alt /pnfs/fnal.gov/usr/d0/data/db5/derivedDetector/csg-p21.20.00-p20.18.02b/dzero/thumbnail/lep/0000/CSskim-NP-20101123-010613-6113518_p21.20.00
-rw-r--r-- 1 1841 1507 936499002 Nov 23 2010 /pnfs/fnal.gov/usr/d0/data/db5/derivedDetector/csg-p21.20.00-p20.18.02b/dzero/thumbnail/lep/0000/CSskim-NP-20101123-010613-6113518_p21.20.00
[root@dcatest01 ~]#
```