TAPE REST API and non-default dCache webdav.root

Open ageorget opened this issue 1 year ago • 41 comments
Hi,

Following the discussion with Petr, I am opening this issue about using the Tape REST API with a WebDAV door configured with a relative webdav.root, which fails with 9.2.6.

gfal-stat with the relative path:

gfal-stat davs://ccdavatlas.in2p3.fr:2880/atlasdatatape/SAM/1M                                                                    
  File: 'davs://ccdavatlas.in2p3.fr:2880/atlasdatatape/SAM/1M'
  Size: 1048576	regular file
Access: (0777/-rwxrwxrwx)	Uid: 0	Gid: 0	
Access: 1970-01-01 01:00:00.000000
Modify: 2024-02-01 15:31:33.000000
Change: 2024-02-01 15:31:32.000000

The API response using the relative path:

gfal-xattr davs://ccdavatlas.in2p3.fr:2880/atlasdatatape/SAM/1M user.status                                                                                                                       
gfal-xattr error: 42 (No message of desired type) - [Tape REST API] No such file or directory /atlasdatatape/SAM/1M

and using the full path:

gfal-xattr davs://ccdavatlas.in2p3.fr:2880/pnfs/in2p3.fr/data/atlas/atlasdatatape/SAM/1M user.status                                                                                              
ONLINE_AND_NEARLINE

The WebDAV configuration has webdav.root set: webdav.root=/pnfs/in2p3.fr/data/atlas/
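
For context, the relevant layout excerpt looks roughly like this (the domain name below is illustrative, not the actual IN2P3-CC one):

[webdav-atlasDomain]
[webdav-atlasDomain/webdav]
webdav.root=/pnfs/in2p3.fr/data/atlas/
webdav.authn.protocol=https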

/var/lib/dcache/httpd/wlcg-tape-rest-api.json

{
  "sitename": "IN2P3-CC",
  "description": "This is the dCache WLCG TAPE REST API endpoint for IN2P3-CC",
  "endpoints":[
      {
        "uri":"https://ccdcamcli06.in2p3.fr:3880/api/v1/tape",
        "version":"v1",
        "metadata": {
        }
      } ]
}
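
For reference, clients such as gfal discover the tape endpoint through the door's /.well-known/wlcg-tape-rest-api resource, which returns this endpoint information (see the Davix trace further down in the thread). A quick way to check what a door advertises (a sketch, reusing the proxy-based curl options from later in this thread):

curl --capath /etc/grid-security/certificates --cacert $X509_USER_PROXY --cert $X509_USER_PROXY \
     "https://ccdavatlas.in2p3.fr:2880/.well-known/wlcg-tape-rest-api"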

Only the webdav door is configured to use a relative path (VO specific). The rest of the dCache configuration uses the default / root.

/etc/grid-security/storage-authzdb

version 2.1
authorize atlagrid read-write 3327 124 / / /
authorize cmsgrid read-write 3033 119 / / /
authorize lhcbgrid read-write 3437 155 / / /

Frontend logs:

Feb 05 10:02:27 ccdcamcli06 dcache@FrontendDomain[138833]: 05 Feb 2024 10:02:27 (frontend) [] getInfo failed for /atlasdatatape/SAM/1M: No such file or directory /atlasdatatape/SAM/1M.

Adrien

ageorget avatar Feb 05 '24 09:02 ageorget

Hi Adrien,

Did you set frontend.root to the same value as webdav.root?

Kind regards, Onno

onnozweers avatar Feb 05 '24 12:02 onnozweers

Hi Onno, No, the frontend is generalist and not dedicated to a VO like the WebDAV doors. But I tried setting frontend.root to the same value as webdav.root (and restarting) for testing, and it does not change anything for this test (though dCache View then displays the correct relative path).

Cheers, Adrien

ageorget avatar Feb 05 '24 13:02 ageorget

Hi Adrien,

I think I can conclude that WebDAV supports relative paths, but the REST API does not. Correct? While we are at it, can you check whether the namespace resource of the frontend suffers from the same issue?
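
(For example, something along these lines against the frontend's namespace resource, once with the relative path and once with the full one. This is only a sketch: it reuses the endpoint host from the JSON above, and assumes the standard frontend namespace resource under /api/v1/namespace.)

curl --capath /etc/grid-security/certificates --cacert $X509_USER_PROXY --cert $X509_USER_PROXY \
     "https://ccdcamcli06.in2p3.fr:3880/api/v1/namespace/atlasdatatape/SAM/1M" -H "accept: application/json"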

Dmitry

DmitryLitvintsev avatar Feb 05 '24 15:02 DmitryLitvintsev

One thing I suggest trying: change storage-authzdb so it looks like:

authorize atlagrid read-write 3327 124 / /pnfs/in2p3.fr/data/atlas /
authorize cmsgrid read-write 3033 119 / /pnfs/in2p3.fr/data/cms /
authorize lhcbgrid read-write 3437 155 / /pnfs/in2p3.fr/data/lhcb /

Then set webdav.root=/ and frontend.root=/. Restart the doors and retry.

DmitryLitvintsev avatar Feb 05 '24 15:02 DmitryLitvintsev

Hi Dmitry, Well, it cannot be done so easily on our side because some protocols like WebDAV use the basepath and some, like SRM and XRootD, don't. So it may need some changes in CRIC at the same time as the dCache configuration change, and this will require a site downtime, I guess.

The WebDAV door should be able to contact the REST API with path resolution, no? BTW, the 9.2.0 release notes contain "Path resolution (for relative paths) has been integrated into bulk request processing." Is that for a different use case?

ageorget avatar Feb 05 '24 15:02 ageorget

Yes, we were under the impression that this was being handled properly. I am just suggesting a mitigation.

I believe that with the path defined in storage-authzdb, XRootD and WebDAV will work fine (with both the full path and the relative path). FTP will take only the relative path. SRM will take both.

Meanwhile I will be getting to the bottom of your issue. I need to reproduce it on a test system.

DmitryLitvintsev avatar Feb 05 '24 15:02 DmitryLitvintsev

But yes, you are right. Since this issue concerns the REST API only, and this is a sort of "experimental" feature, it is safer to wait for a proper resolution.

DmitryLitvintsev avatar Feb 05 '24 16:02 DmitryLitvintsev

I agree with that. We just started to test the REST API so we can wait.

And PIC also seems to be affected by the same issue, according to Petr:

gfal-ls https://webdav-at1.pic.es:8466/atlasdatatape/SAM/testfile-put-ATLASDATATAPE-1307712801-484ec9b9cdd6.txt
https://webdav-at1.pic.es:8466/atlasdatatape/SAM/testfile-put-ATLASDATATAPE-1307712801-484ec9b9cdd6.txt

gfal-xattr https://webdav-at1.pic.es:8466/atlasdatatape/SAM/testfile-put-ATLASDATATAPE-1307712801-484ec9b9cdd6.txt user.status
gfal-xattr error: 42 (No message of desired type) - [Tape REST API] No such file or directory /atlasdatatape/SAM/testfile-put-ATLASDATATAPE-1307712801-484ec9b9cdd6.txt

ageorget avatar Feb 05 '24 16:02 ageorget

I think this is a bug that affects everybody who uses an xxx.root property. NB: at Fermilab we don't; we prefer to have it defined in /etc/grid-security/storage-authzdb. We will investigate. This is a bit unfortunate because Al definitely tried to address it back in April.

DmitryLitvintsev avatar Feb 06 '24 17:02 DmitryLitvintsev

Actually I am completely confused. Why does gfal-xattr use the TAPE API endpoint?! And, yeah, I can confirm the behaviour:


# layout file
webdav.root=/public
frontend.root=${webdav.root}

# 🤷🏼‍♂️ 
$ gfal-xattr http://192.168.178.40:2880/
taperestapi.version = v1
taperestapi.uri = http://192.168.178.40:3880/api/v1/tape
taperestapi.sitename = dcache-systest

# list by prefix ✅ 
$ gfal-ls http://192.168.178.40:2880/
file.txt

# xattr by prefix ⛔ 
$ gfal-xattr http://192.168.178.40:2880/file.txt user.status
gfal-xattr error: 42 (No message of desired type) - [Tape REST API] No such file or directory /file.txt

# xattr by full path ✅ 🤔 
$ gfal-xattr http://192.168.178.40:2880/public/file.txt user.status
NEARLINE

kofemann avatar Feb 08 '24 20:02 kofemann

I think gfal-xattr calls the archiveinfo endpoint of the REST API.
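
(Roughly, gfal-xattr user.status translates into a POST like the one below. This is a sketch based on the archiveinfo calls shown later in this thread, using the endpoint from Adrien's JSON; the path is forwarded as given, which is why only the full /pnfs/... form currently resolves.)

curl --capath /etc/grid-security/certificates --cacert $X509_USER_PROXY --cert $X509_USER_PROXY \
     -X POST "https://ccdcamcli06.in2p3.fr:3880/api/v1/tape/archiveinfo" \
     -H "Content-Type: application/json" -d '{"paths": ["/atlasdatatape/SAM/1M"]}'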

While you are at it, can you check whether any operations in "our API", say namespace, fail with a relative path?

If this is the case, the issue is limited to frontend/REST and hopefully not deeper in bulk.

Conversely, if the namespace API works, this narrows the parameter space.

I think the problem can be recast as follows: the frontend does not support the "frontend.root" variable, which apparently needs to match the "webdav.root" variable.

I can try to look at it tomorrow as well.

DmitryLitvintsev avatar Feb 08 '24 21:02 DmitryLitvintsev

It looks like in REST requests (tape, bulk, qos?) where the path is provided in the payload, dCache doesn't resolve it relative to the configured prefix; the user-provided path is used as-is. The conversion should happen in both directions: in the request and in the reply:

      for (int i = 0; i < len; ++i) {
          // request direction: resolve the user-supplied (possibly relative) path against the configured root
          String requestedPath = jsonArray.getString(i);
          String dcachePath = rootPath.chroot(requestedPath).toString();
          paths.add(dcachePath);
      }

...

  return out.stream().map(ai -> {
      var a = new ArchiveInfo();
      a.setError(ai.getError());
      a.setLocality(ai.getLocality());
      // reply direction: strip the root again so the client gets back the path form it asked for
      a.setPath(FsPath.create(ai.getPath()).stripPrefix(rootPath));
      return a;
  }).toList();

The proposed change addresses the issue:

$ gfal-xattr http://192.168.178.40:2880/file.txt user.status
NEARLINE

I will check whether it is possible to make it more generic and update bulk/qos as well.

kofemann avatar Feb 09 '24 10:02 kofemann

see: https://rb.dcache.org/r/14222/

kofemann avatar Feb 12 '24 08:02 kofemann

Hi @kofemann Do we need to upgrade the bulk service and the webdav doors, or only the bulk service, to get this fix in 9.2.14?

ageorget avatar Mar 12 '24 16:03 ageorget

Only frontend service

kofemann avatar Mar 12 '24 16:03 kofemann

Hi @kofemann I upgraded our dCache core service to 9.2.14 yesterday, but the gfal-xattr command still returns [Tape REST API] No such file or directory with the prefix:

webdav.root=/pnfs/in2p3.fr/data/atlas/

$ gfal-xattr davs://ccdavatlas.in2p3.fr:2880/pnfs/in2p3.fr/data/atlas/atlasdatatape/SAM/1M user.status
NEARLINE
$ gfal-xattr davs://ccdavatlas.in2p3.fr:2880/atlasdatatape/SAM/1M user.status                         
gfal-xattr error: 42 (No message of desired type) - [Tape REST API] No such file or directory /atlasdatatape/SAM/1M

Should I also set frontend.root=${webdav.root} in the webdav door layout to make it work, like in your test?

Adrien

ageorget avatar Mar 13 '24 08:03 ageorget

Hi @ageorget, yes, the frontend and webdav configuration should match if you want consistent behaviour. Thus frontend.root=${webdav.root} should be set.

kofemann avatar Mar 15 '24 09:03 kofemann

Hi @kofemann For now I only have one global frontend and dedicated WebDAV doors for the VOs with the proper prefix (/atlas, /cms, /lhcb). So does that mean I should now configure one frontend for each VO?

ageorget avatar Mar 15 '24 09:03 ageorget

Do you have a webdav door per experiment?

kofemann avatar Mar 15 '24 10:03 kofemann

Yes, a webdav door per experiment (with webdav.root configured) and a generalist one for wlcg/dteam.

ageorget avatar Mar 15 '24 11:03 ageorget

Yes, you will need something similar for the frontend, or we need to understand how you can do it with a single frontend, as then you must tie webdav doors to frontends.

kofemann avatar Mar 15 '24 15:03 kofemann

Hi, Any news about this issue?

ageorget avatar Apr 17 '24 14:04 ageorget

Hi @kofemann @DmitryLitvintsev

I set up a dedicated frontend service for CMS matching frontend.root=${webdav.root}, but I'm facing the same issue. All requests are failing with No such file or directory because the prefix is not used by the bulk service:

(ERROR: diskCacheV111.util.FileNotFoundCacheException : CacheException(rc=10001;msg=No such file or directory /data/store/test/rucio/store/test/loadtest/source/T1_FR_CCIN2P3_Tape_Test/urandom.270MB.file0000))

The webdav conf:

[webdav-ccdcacli537Domain]
[webdav-ccdcacli537Domain/webdav]
webdav.root=/pnfs/in2p3.fr/data/cms/
webdav.authn.protocol=https

wlcg-tape-rest-api.json targets the dedicated frontend with:

[Frontend2Domain]
[Frontend2Domain/frontend]
frontend.authn.basic=true
frontend.authn.protocol=https
frontend.root=/pnfs/in2p3.fr/data/cms/

gfal-xattr is OK:

gfal-xattr -vv davs://ccdavcms.in2p3.fr:2880/data/store/test/rucio/store/hidata/HIRun2023A/HIPhysicsRawPrime19/MINIAOD/PromptReco-v2/000/375/013/00000/011f9c67-2e16-45f9-a832-d4ad1e834fe0.root user.status 

INFO     Davix: > GET /.well-known/wlcg-tape-rest-api HTTP/1.1
> User-Agent: gfal2-util/1.8.1 gfal2/2.22.2 neon/0.0.29
> Keep-Alive: 
> Connection: Keep-Alive
> TE: trailers
> Host: ccdavcms.in2p3.fr:2880
> 

INFO     Davix: < HTTP/1.1 200 OK
INFO     Davix: < Date: Mon, 29 Apr 2024 08:48:31 GMT
INFO     Davix: < Server: dCache/9.2.14
INFO     Davix: < Content-Type: application/json;charset=utf-8
INFO     Davix: < Transfer-Encoding: chunked
INFO     Davix: < 
INFO     Davix: < 
INFO     Davix: > POST /api/v1/tape/archiveinfo HTTP/1.1
> User-Agent: gfal2-util/1.8.1 gfal2/2.22.2 neon/0.0.29
> Keep-Alive: 
> Connection: Keep-Alive
> TE: trailers
> Host: ccdcamcli07.in2p3.fr:3880
> Content-Type: application/json
> Content-Length: 163
> 

INFO     Davix: < HTTP/1.1 200 OK
INFO     Davix: < Date: Mon, 29 Apr 2024 08:48:31 GMT
INFO     Davix: < Server: dCache/9.2.14
INFO     Davix: < Content-Type: application/json
INFO     Davix: < Content-Length: 179
INFO     Davix: < 
NEARLINE

But if I try to stage the file, it fails:

cat stage2.json 
{
  "files": [
    {"path": "/data/store/test/rucio/store/hidata/HIRun2023A/HIPhysicsRawPrime19/MINIAOD/PromptReco-v2/000/375/013/00000/011f9c67-2e16-45f9-a832-d4ad1e834fe0.root", "diskLifetime": "PT1H"}
  ]
}

curl -v --capath /etc/grid-security/certificates --cacert $X509_USER_PROXY --cert $X509_USER_PROXY -X POST "https://ccdcamcli07.in2p3.fr:3880/api/v1/tape/stage" -H  "accept: application/json" -H  "content-type: application/json" -d @stage2.json       
* Connected to ccdcamcli07.in2p3.fr (2001:660:5009:1:134:158:109:247) port 3880 (#0)
> POST /api/v1/tape/stage HTTP/1.1
> User-Agent: curl/7.29.0
> Host: ccdcamcli07.in2p3.fr:3880
> accept: application/json
> content-type: application/json
> Content-Length: 195
> 
* upload completely sent off: 195 out of 195 bytes
< HTTP/1.1 201 Created
< Date: Mon, 29 Apr 2024 08:50:40 GMT
< Server: dCache/9.2.14
< Location: https://ccdcamcli07.in2p3.fr:3880/api/v1/tape/stage/9c4abd96-087b-477c-b67a-62c8fb467cd5
< Content-Type: application/json
< Content-Length: 58
< 
{
  "requestId" : "9c4abd96-087b-477c-b67a-62c8fb467cd5"
* Connection #0 to host ccdcamcli07.in2p3.fr left intact
}

curl --capath /etc/grid-security/certificates --cacert $X509_USER_PROXY --cert $X509_USER_PROXY -X GET "https://ccdcamcli07.in2p3.fr:3880/api/v1/tape/stage/9c4abd96-087b-477c-b67a-62c8fb467cd5" -H  "accept: application/json" 
{
  "id" : "9c4abd96-087b-477c-b67a-62c8fb467cd5",
  "createdAt" : 1714380640973,
  "startedAt" : 1714380641016,
  "completedAt" : 1714380641071,
  "files" : [ {
    "path" : "/data/store/test/rucio/store/hidata/HIRun2023A/HIPhysicsRawPrime19/MINIAOD/PromptReco-v2/000/375/013/00000/011f9c67-2e16-45f9-a832-d4ad1e834fe0.root",
    "finishedAt" : 1714380640979,
    "startedAt" : 1714380640979,
    "error" : "CacheException(rc=10001;msg=No such file or directory /data/store/test/rucio/store/hidata/HIRun2023A/HIPhysicsRawPrime19/MINIAOD/PromptReco-v2/000/375/013/00000/011f9c67-2e16-45f9-a832-d4ad1e834fe0.root)",
    "state" : "FAILED"
  } ]
}%

In the bulk log:

Apr 29 10:50:41 ccdcamcli06 dcache@bulkDomain[49218]: 29 Apr 2024 10:50:41 (bulk) [] 9c4abd96-087b-477c-b67a-62c8fb467cd5 - fetchAttributes, callback failure for TARGET [241299, INITIAL, null][null][CREATED: (C 2024-04-29 10:50:40.979)(S null)(U 2024-04-29 10:50:40.979)(ret 0)][null] null : /data/store/test/rucio/store/hidata/HIRun2023A/HIPhysicsRawPrime19/MINIAOD/PromptReco-v2/000/375/013/00000/011f9c67-2e16-45f9-a832-d4ad1e834fe0.root (err null null).

(bulk@bulkDomain) ageorget > request info 9c4abd96-087b-477c-b67a-62c8fb467cd5
9c4abd96-087b-477c-b67a-62c8fb467cd5:
status:           COMPLETED
arrived at:       2024-04-29 10:50:40.973
started at:       2024-04-29 10:50:41.013
last modified at: 2024-04-29 10:50:41.07
target prefix:    /
targets:
CREATED                   |                   STARTED |                 COMPLETED |        STATE | TARGET
2024-04-29 10:50:40.979   |   2024-04-29 10:50:40.979 |   2024-04-29 10:50:40.979 |       FAILED | /data/store/test/rucio/store/hidata/HIRun2023A/HIPhysicsRawPrime19/MINIAOD/PromptReco-v2/000/375/013/00000/011f9c67-2e16-45f9-a832-d4ad1e834fe0.root -- (ERROR: diskCacheV111.util.FileNotFoundCacheException : CacheException(rc=10001;msg=No such file or directory /data/store/test/rucio/store/hidata/HIRun2023A/HIPhysicsRawPrime19/MINIAOD/PromptReco-v2/000/375/013/00000/011f9c67-2e16-45f9-a832-d4ad1e834fe0.root))

ageorget avatar Apr 29 '24 08:04 ageorget

Hi, @ageorget. Could you please add the full path name to the issue, so that we can see which path it should resolve to?

kofemann avatar Apr 29 '24 09:04 kofemann

The full path for the last example is:

-rw-r--r-- 1 cmsgrid cmsf 4.0G Apr 26 15:11 /pnfs/in2p3.fr/data/cms/data/store/test/rucio/store/hidata/HIRun2023A/HIPhysicsRawPrime19/MINIAOD/PromptReco-v2/000/375/013/00000/011f9c67-2e16-45f9-a832-d4ad1e834fe0.root

So prefix + path: /pnfs/in2p3.fr/data/cms + /data/store/test/rucio/store/hidata/HIRun2023A/HIPhysicsRawPrime19/MINIAOD/PromptReco-v2/000/375/013/00000/011f9c67-2e16-45f9-a832-d4ad1e834fe0.root

ageorget avatar Apr 29 '24 10:04 ageorget

Can someone from the dCache team provide an update on where things stand? (CMS needs this fixed to complete the SRMv2 phase-out.) Thanks,

  • Stephan

stlammel avatar May 24 '24 15:05 stlammel

on CMS T1 system: @ageorget @kofemann @stlammel

It works for us. On CMS T1 tape:

frontend.root = /pnfs/fs/usr/cms
webdav.root = /pnfs/fs/usr/cms

storage-authzdb:

authorize cmsprod read-write 9811 5063,9114,9247 / /pnfs/fs/usr/cms /pnfs/fs/usr/cms

check archiveinfo:

$ curl  --capath /etc/grid-security/certificates --cert /tmp/x509up_u`id -u` --cacert /tmp/x509up_u`id -u` --key  /tmp/x509up_u`id -u` -X POST https://cmsdcatape.fnal.gov:3880/api/v1/tape/archiveinfo -H "Content-Type: application/json" -d '{"paths"  : ["/WAX/11/store/test/rucio/store/express/Run2023C/StreamALCAPPSExpress/ALCARECO/PPSCalMaxTracks-Express-v4/000/368/382/00000/271228d6-6080-40f1-ad54-72ec4c6da394.root"]}'

reply:

[{"path":"/WAX/11/store/test/rucio/store/express/Run2023C/StreamALCAPPSExpress/ALCARECO/PPSCalMaxTracks-Express-v4/000/368/382/00000/271228d6-6080-40f1-ad54-72ec4c6da394.root","locality":"TAPE"}]

bring it online:

$ gfal-bringonline https://cmsdcatape.fnal.gov:3880/WAX/11/store/test/rucio/store/express/Run2023C/Strea\
mALCAPPSExpress/ALCARECO/PPSCalMaxTracks-Express-v4/000/368/382/00000/271228d6-6080-40f1-ad54-72ec4c6da394.root

reply:

https://cmsdcatape.fnal.gov:3880/WAX/11/store/test/rucio/store/express/Run2023C/StreamALCAPPSExpress/ALCARECO/PPSCalMaxTracks-Express-v4/000/368/382/00000/271228d6-6080-40f1-ad54-72ec4c6da394.root QUEUED

in bulk I see:

[cmsdcatape02new] (bulk@bulkDomain) admin > request ls 
ID           | ARRIVED             |            MODIFIED |        OWNER |     STATUS | UID
6240         | 2024/05/24-15:39:13 | 2024/05/24-15:39:14 |    9811:5063 |    STARTED | 31c38979-d1f2-4ac2-883c-0b00458f206d
[cmsdcatape02new] (bulk@bulkDomain) admin > request info 31c38979-d1f2-4ac2-883c-0b00458f206d
31c38979-d1f2-4ac2-883c-0b00458f206d:
status:           STARTED
arrived at:       2024-05-24 15:39:13.697
started at:       2024-05-24 15:39:14.336
last modified at: 2024-05-24 15:39:14.336
target prefix:    /pnfs/fs/usr/cms
targets:
CREATED                   |                   STARTED |                 COMPLETED |        STATE | TARGET
2024-05-24 15:39:13.92    |    2024-05-24 15:39:13.92 |                         ? |      RUNNING | /WAX/11/store/test/rucio/store/express/Run2023C/StreamALCAPPSExpress/ALCARECO/PPSCalMaxTracks-Express-v4/000/368/382/00000/271228d6-6080-40f1-ad54-72ec4c6da394.root

and after while

[cmsdcatape02new] (bulk@bulkDomain) admin > request info 31c38979-d1f2-4ac2-883c-0b00458f206d
31c38979-d1f2-4ac2-883c-0b00458f206d:
status:           COMPLETED
arrived at:       2024-05-24 15:39:13.697
started at:       2024-05-24 15:39:14.336
last modified at: 2024-05-24 15:42:18.336
target prefix:    /pnfs/fs/usr/cms
targets:
CREATED                   |                   STARTED |                 COMPLETED |        STATE | TARGET
2024-05-24 15:39:13.92    |    2024-05-24 15:39:13.92 |   2024-05-24 15:42:18.324 |    COMPLETED | /WAX/11/store/test/rucio/store/express/Run2023C/StreamALCAPPSExpress/ALCARECO/PPSCalMaxTracks-Express-v4/000/368/382/00000/271228d6-6080-40f1-ad54-72ec4c6da394.root

And checking archiveinfo again:

$ curl  --capath /etc/grid-security/certificates --cert /tmp/x509up_u`id -u` --cacert /tmp/x509up_u`id -u` --key  /tmp/x509up_u`id -u` -X POST https://cmsdcatape.fnal.gov:3880/api/v1/tape/archiveinfo -H "Content-Type: application/json" -d '{"paths"  : ["/WAX/11/store/test/rucio/store/express/Run2023C/StreamALCAPPSExpress/ALCARECO/PPSCalMaxTracks-Express-v4/000/368/382/00000/271228d6-6080-40f1-ad54-72ec4c6da394.root"]}'

reply:

[{"path":"/WAX/11/store/test/rucio/store/express/Run2023C/StreamALCAPPSExpress/ALCARECO/PPSCalMaxTracks-Express-v4/000/368/382/00000/271228d6-6080-40f1-ad54-72ec4c6da394.root","locality":"DISK_AND_TAPE"}]

As you can see, it works as designed for us. The real full path of the file is:

# ls -al /pnfs/fs/usr/cms/WAX/11/store/test/rucio/store/express/Run2023C/StreamALCAPPSExpress/ALCARECO/PPSCalMaxTracks-Express-v4/000/368/382/00000/271228d6-6080-40f1-ad54-72ec4c6da394.root
-rw-r--r-- 1 9811 5063 43073478 Feb 27 04:12 /pnfs/fs/usr/cms/WAX/11/store/test/rucio/store/express/Run2023C/StreamALCAPPSExpress/ALCARECO/PPSCalMaxTracks-Express-v4/000/368/382/00000/271228d6-6080-40f1-ad54-72ec4c6da394.root

@ageorget is the frontend.root in your case a real file path or a symlink? (Just fishing for possible differences.)

Dmitry

DmitryLitvintsev avatar May 24 '24 20:05 DmitryLitvintsev

Thanks Dmitry! @DmitryLitvintsev Let's try to track down the Fermilab/In2P3 difference and see if we can overcome this.

  • Stephan

stlammel avatar May 24 '24 22:05 stlammel

Hi @DmitryLitvintsev

The frontend.root is a real file path, not a symlink. The only difference I see in our dCache configuration is the storage-authzdb, which is set to the root path:

authorize atlagrid read-write 3327 124 / / /
authorize cmsgrid read-write 3033 119 / / /

If we need to set the basepath in the storage-authzdb, we will have to coordinate the changes with CRIC because, as I said, currently some protocols like WebDAV use the basepath and some, like SRM and XRootD (redirector, local access), don't.

The same tests you mentioned, using the ATLAS frontend (9.2.14 with frontend.root=/pnfs/in2p3.fr/data/atlas):

check archiveinfo, OK:

curl  --capath /etc/grid-security/certificates --cert /tmp/x509up_u`id -u` --cacert /tmp/x509up_u`id -u` --key  /tmp/x509up_u`id -u` -X POST https://ccdcamcli08.in2p3.fr:3880/api/v1/tape/archiveinfo -H "Content-Type: application/json" -d '{"paths"  : ["/atlasmctape/mc16_13TeV/HITS/e8351_s3126/mc16_13TeV.700337.Sh_2211_Znunu_pTV2_CVetoBVeto.simul.HITS.e8351_s3126_tid30364865_00/HITS.30364865._017868.pool.root.1"]}' 
[{"path":"/atlasmctape/mc16_13TeV/HITS/e8351_s3126/mc16_13TeV.700337.Sh_2211_Znunu_pTV2_CVetoBVeto.simul.HITS.e8351_s3126_tid30364865_00/HITS.30364865._017868.pool.root.1","locality":"TAPE"}]

bring it online, OK:

gfal-bringonline https://ccdcamcli08.in2p3.fr:3880/atlasmctape/mc16_13TeV/HITS/e8351_s3126/mc16_13TeV.700337.Sh_2211_Znunu_pTV2_CVetoBVeto.simul.HITS.e8351_s3126_tid30364865_00/HITS.30364865._017868.pool.root.1
https://ccdcamcli08.in2p3.fr:3880/atlasmctape/mc16_13TeV/HITS/e8351_s3126/mc16_13TeV.700337.Sh_2211_Znunu_pTV2_CVetoBVeto.simul.HITS.e8351_s3126_tid30364865_00/HITS.30364865._017868.pool.root.1 QUEUED

in bulk, the request failed:

request info 776cec02-be3b-4222-98e9-1cb7d000ea15
776cec02-be3b-4222-98e9-1cb7d000ea15:
status:           COMPLETED
arrived at:       2024-05-27 13:59:32.691
started at:       2024-05-27 13:59:32.705
last modified at: 2024-05-27 13:59:32.724
target prefix:    /
targets:
CREATED                   |                   STARTED |                 COMPLETED |        STATE | TARGET
2024-05-27 13:59:32.695   |   2024-05-27 13:59:32.695 |   2024-05-27 13:59:32.695 |       FAILED | /atlasmctape/mc16_13TeV/HITS/e8351_s3126/mc16_13TeV.700337.Sh_2211_Znunu_pTV2_CVetoBVeto.simul.HITS.e8351_s3126_tid30364865_00/HITS.30364865._017868.pool.root.1 -- (ERROR: diskCacheV111.util.FileNotFoundCacheException : CacheException(rc=10001;msg=No such file or directory /atlasmctape/mc16_13TeV/HITS/e8351_s3126/mc16_13TeV.700337.Sh_2211_Znunu_pTV2_CVetoBVeto.simul.HITS.e8351_s3126_tid30364865_00/HITS.30364865._017868.pool.root.1))

ageorget avatar May 27 '24 08:05 ageorget

Well, I tried changing storage-authzdb as @DmitryLitvintsev suggested, adding the basepath:

authorize atlagrid read-write 3327 124 / /pnfs/in2p3.fr/data/atlas /
authorize cmsgrid read-write 3033 119 / /pnfs/in2p3.fr/data/cms /

And this was enough to make staging work using the prefix:

request info 6986f840-2cb9-4bfa-87c8-b416643de233
6986f840-2cb9-4bfa-87c8-b416643de233:
status:           COMPLETED
arrived at:       2024-05-30 10:26:53.244
started at:       2024-05-30 10:26:53.261
last modified at: 2024-05-30 10:32:16.596
target prefix:    /pnfs/in2p3.fr/data/cms
targets:
CREATED                   |                   STARTED |                 COMPLETED |        STATE | TARGET
2024-05-30 10:26:53.249   |   2024-05-30 10:26:53.249 |   2024-05-30 10:32:16.586 |    COMPLETED | /data/store/test/rucio/store/hidata/HIRun2023A/HIPhysicsRawPrime19/MINIAOD/PromptReco-v2/000/375/013/00000/011f9c67-2e16-45f9-a832-d4ad1e834fe0.root
request info 417a5959-c3ee-43e8-aafc-d7b5a3f3e5af
417a5959-c3ee-43e8-aafc-d7b5a3f3e5af:
status:           COMPLETED
arrived at:       2024-05-30 11:42:14.699
started at:       2024-05-30 11:42:14.712
last modified at: 2024-05-30 11:48:39.733
target prefix:    /pnfs/in2p3.fr/data/atlas
targets:
CREATED                   |                   STARTED |                 COMPLETED |        STATE | TARGET
2024-05-30 11:42:14.703   |   2024-05-30 11:42:14.703 |   2024-05-30 11:48:39.729 |    COMPLETED | /atlasmctape/mc16_13TeV/HITS/e8351_s3126/mc16_13TeV.700337.Sh_2211_Znunu_pTV2_CVetoBVeto.simul.HITS.e8351_s3126_tid30364865_00/HITS.30364865._017868.pool.root.1

But now I see transfer errors with /upload, which moves from / to /pnfs/in2p3.fr/data/atlas. I think this change should be coordinated with the generalist SRM+HTTPS WebDAV doors which used webdav.root=/.

So setting the prefix in storage-authzdb allowed the bulk service to get target prefix: /pnfs/in2p3.fr/data/atlas configured. Is there a way to make this work with the frontend.root parameter instead of being forced to change the main dCache configuration?

ageorget avatar May 30 '24 08:05 ageorget