
Bulk release requests do not work when using a relative path

Open ageorget opened this issue 1 year ago • 4 comments

Hi,

I found that the release process does not work when the release request uses a relative path (without the prefix). This could explain why our Atlas staging buffer is full most of the time.

To reproduce it, I sent a staging request for this file: /atlasmctape/mc16_13TeV/HITS/e8351_s3126/mc16_13TeV.700337.Sh_2211_Znunu_pTV2_CVetoBVeto.simul.HITS.e8351_s3126_tid30364865_00/HITS.30364865._017868.pool.root.1

cat stageAtlas.json
{
"files": [
{"path": "/atlasmctape/mc16_13TeV/HITS/e8351_s3126/mc16_13TeV.700337.Sh_2211_Znunu_pTV2_CVetoBVeto.simul.HITS.e8351_s3126_tid30364865_00/HITS.30364865._017868.pool.root.1","diskLifetime":"PT1H"}
]
}

curl --capath /etc/grid-security/certificates --cacert $X509_USER_PROXY --cert $X509_USER_PROXY -X POST "https://ccdcamcli08.in2p3.fr:3880/api/v1/tape/stage" -H  "accept: application/json" -H  "content-type: application/json" -d @stageAtlas.json
{
  "requestId" : "1b72f21e-d66a-4af7-a784-6178a3c3a35c"
}

level=INFO ts=2024-08-12T16:07:39.771+0200 event=org.dcache.frontend.request request.method=POST request.url=https://ccdcamcli08.in2p3.fr:3880/api/v1/tape/stage response.code=201 response.reason=Created location=https://ccdcamcli08.in2p3.fr:3880/api/v1/tape/stage/1b72f21e-d66a-4af7-a784-6178a3c3a35c socket.remote=[2001:660:5009:84:134:158:239:7]:35504 user-agent=curl/7.29.0 user.dn="CN=1855496286,CN=GEORGET Adrien [email protected],O=Centre national de la recherche scientifique,C=FR,DC=tcs,DC=terena,DC=org" user.mapped=3327:124 request.entity="{\"files\":[{\"path\"[...]fetime\":\"PT1H\"}]}" response.entity="{\n  \"requestId\" : \"1b72f21e-d66a-4a[...]" duration=15

Staging is OK and the file is pinned in the disk cache:

\s pool-atlas-read-li425a rep sticky ls 000098FBFE5589274CABB284DA5BBB379C4B
self : expires 8/12/24, 4:12 PM
PinManager-0649a68f-2bc8-48e6-8138-c40d0b4bf130 : expires 8/14/24, 4:37 PM

Then I release the file using its relative path:

cat archiveinfo.json
{
"paths": ["/atlasmctape/mc16_13TeV/HITS/e8351_s3126/mc16_13TeV.700337.Sh_2211_Znunu_pTV2_CVetoBVeto.simul.HITS.e8351_s3126_tid30364865_00/HITS.30364865._017868.pool.root.1"]
}

curl --capath /etc/grid-security/certificates --cacert $X509_USER_PROXY --cert $X509_USER_PROXY -X POST "https://ccdcamcli08.in2p3.fr:3880/api/v1/tape/release/1b72f21e-d66a-4af7-a784-6178a3c3a35c" -H  "accept: application/json" -H  "content-type: application/json" -d @archiveinfo.json

level=INFO ts=2024-08-12T16:10:47.568+0200 event=org.dcache.frontend.request request.method=POST request.url=https://ccdcamcli08.in2p3.fr:3880/api/v1/tape/release/1b72f21e-d66a-4af7-a784-6178a3c3a35c response.code=200 response.reason=OK socket.remote=[2001:660:5009:84:134:158:239:7]:35512 user-agent=curl/7.29.0 user.dn="CN=1855496286,CN=GEORGET Adrien [email protected],O=Centre national de la recherche scientifique,C=FR,DC=tcs,DC=terena,DC=org" user.mapped=3327:124 request.entity="{\"paths\":[\"/atlas[...]68.pool.root.1\"]}" duration=11

After 30 minutes, the pin is still active:

\s pool-atlas-read-li425a rep sticky ls 000098FBFE5589274CABB284DA5BBB379C4B
PinManager-0649a68f-2bc8-48e6-8138-c40d0b4bf130 : expires 8/14/24, 4:37 PM
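
For scripted checks, the `rep sticky ls` output can be parsed to see whether a PinManager pin remains after a release. Here is a minimal Python sketch, assuming each output line has the form `<owner> : expires <date>` as in the listings in this report. The function name `pinmanager_stickies` is hypothetical and not part of dCache:

```python
from __future__ import annotations


def pinmanager_stickies(sticky_ls_output: str) -> list[str]:
    """Return the owners of sticky flags that belong to PinManager.

    Assumes the "<owner> : expires <date>" layout seen in this report.
    """
    owners = []
    for line in sticky_ls_output.splitlines():
        # Split on the first " : " so the time of day in the date is untouched.
        owner, sep, _ = line.partition(" : ")
        owner = owner.strip()
        if sep and owner.startswith("PinManager-"):
            owners.append(owner)
    return owners
```

An empty list means no PinManager pin is left on the replica. A non-empty list after a release request suggests the release did not take effect.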

If I instead release the file using its full path, it is unpinned from the disk immediately:

cat archiveinfo.json
{
"paths": ["/pnfs/in2p3.fr/data/atlas/atlasmctape/mc16_13TeV/HITS/e8351_s3126/mc16_13TeV.700337.Sh_2211_Znunu_pTV2_CVetoBVeto.simul.HITS.e8351_s3126_tid30364865_00/HITS.30364865._017868.pool.root.1"]
}

[16:17]:curl --capath /etc/grid-security/certificates --cacert $X509_USER_PROXY --cert $X509_USER_PROXY -X POST "https://ccdcamcli08.in2p3.fr:3880/api/v1/tape/release/1b72f21e-d66a-4af7-a784-6178a3c3a35c" -H  "accept: application/json" -H  "content-type: application/json" -d @archiveinfo.json

level=INFO ts=2024-08-12T16:17:19.146+0200 event=org.dcache.frontend.request request.method=POST request.url=https://ccdcamcli08.in2p3.fr:3880/api/v1/tape/release/1b72f21e-d66a-4af7-a784-6178a3c3a35c response.code=200 response.reason=OK socket.remote=[2001:660:5009:84:134:158:239:7]:35522 user-agent=curl/7.29.0 user.dn="CN=1855496286,CN=GEORGET Adrien [email protected],O=Centre national de la recherche scientifique,C=FR,DC=tcs,DC=terena,DC=org" user.mapped=3327:124 request.entity="{\"paths\":[\"/pnfs/[...]68.pool.root.1\"]}" duration=27

In PinManager:

Aug 12 16:17:20 ccdcamcli08 dcache@PinManagerDomain[129736]: 12 Aug 2024 16:17:20 (PinManager) [BackgroundUnpinner-201460] Unpining [955776409] 000098FBFE5589274CABB284DA5BBB379C4B (1b72f21e-d66a-4af7-a784-6178a3c3a35c) by 3327:124 2024-08-12 16:07:39 to 2024-08-14 16:07:45 is READY_TO_UNPIN on pool-atlas-read-li425a:PinManager-0649a68f-2bc8-48e6-8138-c40d0b4bf130

[ccdcamcli06] (bulk@bulkDomain) ageorget > \s pool-atlas-read-li425a rep sticky ls 000098FBFE5589274CABB284DA5BBB379C4B
[ccdcamcli06] (bulk@bulkDomain) ageorget > 
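
Until the relative-path case is handled, a client could work around it by expanding paths to their full form before building the release body. The following is a rough Python sketch, not a fix for dCache itself. It assumes the prefix `/pnfs/in2p3.fr/data/atlas`, which is inferred from the two paths in this report and is site-specific. The helper names `to_full_path` and `release_payload` are hypothetical:

```python
import json
import posixpath

# Assumption: prefix inferred from the relative and full paths in this report.
NAMESPACE_PREFIX = "/pnfs/in2p3.fr/data/atlas"


def to_full_path(path, prefix=NAMESPACE_PREFIX):
    """Prepend the namespace prefix unless the path already carries it."""
    if path == prefix or path.startswith(prefix + "/"):
        return path
    return posixpath.join(prefix, path.lstrip("/"))


def release_payload(paths, prefix=NAMESPACE_PREFIX):
    """Build the JSON body for POST /api/v1/tape/release/<requestId>."""
    return json.dumps({"paths": [to_full_path(p, prefix) for p in paths]})
```

The resulting string can be written to `archiveinfo.json` and sent with the same `curl` command as above.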

The bulk service also does not report when a release request was not carried out. Can you check this, please?

Adrien

ageorget, Aug 12 '24 14:08