
HTTP-TPC doesn’t support percent-encoded paths

Open SarwarAleem opened this issue 1 year ago • 3 comments

The Source or Destination URL may contain percent-encoded characters (sometimes this is required). Currently, dCache's HTTP-TPC support does not decode any percent-encoded characters in the Source/Destination URL, resulting in transfers failing.

SarwarAleem avatar Feb 16 '24 08:02 SarwarAleem

Here is an example to illustrate the problem.

paul@celebrimbor:~$ curl -H "Authorization: Bearer $(oidc-token EGI-CHECKIN)" -X COPY -H "Credential: none" -H "Source: https://prometheus.desy.de/Video/BlenderFoundation/Tears%20of%20steel.webm" https://dcache-door-doma01.desy.de:2443/Users/paul/test%201
Perf Marker
    Timestamp: 1708331124
    State: Running
    State description: Mover created
    Stripe Index: 0
    Stripe Start Time: 1708331119
    Stripe Last Transferred: 1708331124
    Stripe Transfer Time: 4
    Stripe Bytes Transferred: 462667759
    Stripe Status: RUNNING
    Total Stripe Count: 1
    RemoteConnections: tcp:131.169.5.149:443
End
success: Created
paul@celebrimbor:~$ 

The target is https://dcache-door-doma01.desy.de:2443/Users/paul/test%201. This should have created a file test 1 in my home directory /Users/paul/. However, the above transfer created the file test%201 in that directory.
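To make the expected behaviour concrete, here is a minimal sketch of the decoding step, written in Python with only the standard library. It is not dCache's code (dCache is written in Java) and the helper name `namespace_path` is hypothetical; it only illustrates how the path component of the URLs from the example above should map to a namespace path once percent-encoding is decoded.

```python
from urllib.parse import urlsplit, unquote


def namespace_path(url: str) -> str:
    """Return the percent-decoded path component of a URL.

    Hypothetical helper for illustration only; not part of dCache.
    """
    # urlsplit() isolates the path; unquote() turns %20 into a space
    # and decodes multi-byte sequences as UTF-8 by default.
    return unquote(urlsplit(url).path)


# URLs taken from the curl example above.
target = "https://dcache-door-doma01.desy.de:2443/Users/paul/test%201"
source = "https://prometheus.desy.de/Video/BlenderFoundation/Tears%20of%20steel.webm"

print(namespace_path(target))  # /Users/paul/test 1
print(namespace_path(source))  # /Video/BlenderFoundation/Tears of steel.webm
```

Without the `unquote` step the target path stays `/Users/paul/test%201`, which matches the file name that the transfer above actually created.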

paulmillar avatar Feb 19 '24 13:02 paulmillar

Just a word of warning. By allowing spaces in file names we are "daring" the underlying HSM storage not to choke on them too. So I would like to gauge how important this support is for dCache.

DmitryLitvintsev avatar Feb 20 '24 16:02 DmitryLitvintsev

Realistically, fixing this problem is probably not urgent, and I will most likely end up fixing it myself (when I get a spare moment) anyway. I wanted this issue mostly to make sure the problem isn't forgotten.

Specifically on your warning, I believe dCache supports spaces in filenames with other protocols (not 100% sure about xroot, but pretty sure about everything else), so I don't see why dCache shouldn't support spaces with HTTP-TPC.

The tape issue is, perhaps, a concern. WLCG doesn't have files with spaces in their names (otherwise we would have noticed this problem earlier), so our major tape users shouldn't have a problem.

As an aside, if you want to see the kind of things dCache supports as filenames, take a look at the UTF-8 test on prometheus. Yes, dCache's namespace works fine with filenames written in Viking runes. I don't know if anyone has tried writing such files to tape, though.

paulmillar avatar Feb 20 '24 22:02 paulmillar

Patch: https://rb.dcache.org/r/14239/

paulmillar avatar Mar 25 '24 10:03 paulmillar