dcache
dcache copied to clipboard
7.2 (and trunk) - NPE on removal via WebDAV and token
when running this command:
curl -f -k -L -H "Authorization: Bearer ${TOKEN}" -X DELETE https://fndcaitb4.fnal.gov:2880${spath}/junk.data
I get this error
curl: (22) NSS: client certificate not found (nickname not specified)
The file actually gets removed and I see a NPE in the door pinboard:
16 Sep 2022 15:27:42 [jetty-59] [door:WebDAV-fndcaitb4-1@webdavDomain:AAXo0TGqBGA] DELETE :: fndcaitb4.fnal.gov:2880///mu2e/scratch/users/litvinse/junk.data start
16 Sep 2022 15:27:42 [jetty-59] [door:WebDAV-fndcaitb4-1@webdavDomain:AAXo0TGqBGA] Internal server errorjava.lang.NullPointerException: null
at org.dcache.notification.DoorRequestMessageSerializer.serialize(DoorRequestMessageSerializer.java:77)
at org.dcache.notification.DoorRequestMessageSerializer.serialize(DoorRequestMessageSerializer.java:18)
at org.apache.kafka.common.serialization.Serializer.serialize(Serializer.java:62)
at org.apache.kafka.clients.producer.KafkaProducer.doSend(KafkaProducer.java:945)
at org.apache.kafka.clients.producer.KafkaProducer.send(KafkaProducer.java:905)
at org.springframework.kafka.core.DefaultKafkaProducerFactory$CloseSafeProducer.send(DefaultKafkaProducerFactory.java:993)
at org.springframework.kafka.core.KafkaTemplate.doSend(KafkaTemplate.java:649)
at org.springframework.kafka.core.KafkaTemplate.send(KafkaTemplate.java:403)
at org.springframework.kafka.core.KafkaTemplate.sendDefault(KafkaTemplate.java:382)
at org.dcache.webdav.DcacheResourceFactory.sendRemoveInfoToBilling(DcacheResourceFactory.java:1098)
at org.dcache.webdav.DcacheResourceFactory.deleteFile(DcacheResourceFactory.java:1083)
at org.dcache.webdav.DcacheFileResource.delete(DcacheFileResource.java:190)
at io.milton.http.DeleteHelperImpl.delete(DeleteHelperImpl.java:126)
at io.milton.http.http11.DeleteHandler.processExistingResource(DeleteHandler.java:98)
at io.milton.http.ResourceHandlerHelper.processResource(ResourceHandlerHelper.java:196)
at io.milton.http.ResourceHandlerHelper.processResource(ResourceHandlerHelper.java:131)
at io.milton.http.http11.DeleteHandler.processResource(DeleteHandler.java:83)
at io.milton.http.ResourceHandlerHelper.process(ResourceHandlerHelper.java:127)
at org.dcache.webdav.DcacheResourceHandlerHelper.process(DcacheResourceHandlerHelper.java:42)
at io.milton.http.http11.DeleteHandler.process(DeleteHandler.java:72)
at org.dcache.webdav.DcacheStandardFilter.process(DcacheStandardFilter.java:50)
at io.milton.http.FilterChain.process(FilterChain.java:46)
at org.dcache.webdav.transfer.CopyFilter.process(CopyFilter.java:276)
at io.milton.http.FilterChain.process(FilterChain.java:46)
at io.milton.http.HttpManager.process(HttpManager.java:158)
at org.dcache.webdav.MiltonHandler.handle(MiltonHandler.java:77)
at org.eclipse.jetty.server.handler.HandlerList.handle(HandlerList.java:59)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
at org.dcache.http.AuthenticationHandler.access$001(AuthenticationHandler.java:55)
at org.dcache.http.AuthenticationHandler.lambda$handle$0(AuthenticationHandler.java:156)
at java.base/java.security.AccessController.doPrivileged(Native Method)
at java.base/javax.security.auth.Subject.doAs(Subject.java:361)
at org.dcache.http.AuthenticationHandler.handle(AuthenticationHandler.java:153)
at org.eclipse.jetty.server.handler.HandlerList.handle(HandlerList.java:59)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
at org.dcache.http.AbstractLoggingHandler.handle(AbstractLoggingHandler.java:110)
at org.eclipse.jetty.server.handler.HandlerList.handle(HandlerList.java:59)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:322)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
at org.eclipse.jetty.server.Server.handle(Server.java:516)
at org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:400)
at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:645)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:392)
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:277)
at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:105)
at org.eclipse.jetty.io.ssl.SslConnection$DecryptedEndPoint.onFillable(SslConnection.java:555)
at org.eclipse.jetty.io.ssl.SslConnection.onFillable(SslConnection.java:410)
at org.eclipse.jetty.io.ssl.SslConnection$2.succeeded(SslConnection.java:164)
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:105)
at org.eclipse.jetty.io.ChannelEndPoint$1.run(ChannelEndPoint.java:104)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:338)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:315)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:173)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:131)
at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:409)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:883)
at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1034)
at java.base/java.lang.Thread.run(Thread.java:829)
16 Sep 2022 15:27:42 [jetty-59] [door:WebDAV-fndcaitb4-1@webdavDomain:AAXo0TGqBGA] DELETE :: fndcaitb4.fnal.gov:2880///mu2e/scratch/users/litvinse/junk.data finished 36ms, Status:HTTP/1.1 500, Length:23
More details:
- get token:
[litvinse@fnisd1 scitokens]$ htgettoken -a fermicloud543.fnal.gov -i mu2e
Attempting to get token from https://fermicloud543.fnal.gov:8200 ... failed
Attempting kerberos auth with https://fermicloud543.fnal.gov:8200 ... succeeded
Storing vault token in /tmp/vt_u8637
Attempting to get token from https://fermicloud543.fnal.gov:8200 ... succeeded
Storing bearer token in /run/user/8637/bt_u8637
[litvinse@fnisd1 scitokens]$ spath=`httokendecode | jq ".scope" | sed -e "s/\"//g" | awk '{ print $NF}' | cut -d ":" -f2`
[litvinse@fnisd1 scitokens]$ echo $spath
/mu2e/scratch/users/litvinse
[litvinse@fnisd1 scitokens]$ httokendecode
{
"wlcg.ver": "1.0",
"aud": "https://wlcg.cern.ch/jwt/v1/any",
"sub": "408c3486-b1ce-4d3d-ab3f-74717f80a534",
"nbf": 1663358665,
"scope": "storage.read:/mu2e compute.create compute.read storage.create:/mu2e/scratch/users/litvinse compute.cancel compute.modify storage.modify:/mu2e/scratch/users/litvinse",
"iss": "https://cilogon.org/mu2e",
"exp": 1663369470,
"iat": 1663358670,
"wlcg.groups": [
"/mu2e"
],
"jti": "https://cilogon.org/oauth2/3cbad440f5235fc44c38e465d3f86c6e?type=accessToken&ts=1663358669990&version=v2.0&lifetime=10800000"
}
- Upload a file:
[litvinse@fnisd1 scitokens]$ curl -f -k -L -H "Authorization: Bearer ${TOKEN}" -Tjunk https://fndcaitb4.fnal.gov:2880${spath}/junk.data
curl: (22) NSS: client certificate not found (nickname not specified)
[litvinse@fnisd1 scitokens]$ TOKEN=`cat /run/user/8637/bt_u8637`
[litvinse@fnisd1 scitokens]$ curl -f -k -L -H "Authorization: Bearer ${TOKEN}" -Tjunk https://fndcaitb4.fnal.gov:2880${spath}/junk.data
837787900 bytes uploaded
- check that it now exists:
[root@fndcaitb2 litvinse]# ls -altr
total 3272611
drwxrwxr-x 100 17840 9914 512 Aug 16 15:09 ..
drwxrwxr-x 6 litvinse 9914 512 Sep 16 15:26 .
-rw-rw-r-- 1 litvinse 9914 837787900 Sep 16 15:26 junk.data
- try to remove
[litvinse@fnisd1 scitokens]$ curl -f -k -L -H "Authorization: Bearer ${TOKEN}" -X DELETE https://fndcaitb4.fnal.gov:2880${spath}/junk.data
curl: (22) NSS: client certificate not found (nickname not specified)
- observe that file does not exist:
[root@fndcaitb2 litvinse]# ls -altr
total 2454459
drwxrwxr-x 100 17840 9914 512 Aug 16 15:09 ..
drwxrwxr-x 5 litvinse 9914 512 Sep 16 15:27 .
- using gfal client fails as well:
[litvinse@fnisd1 scitokens]$ export BEARER_TOKEN=`cat /run/user/8637/bt_u8637`
[litvinse@fnisd1 scitokens]$ gfal-copy -K adler32 junk https://fndcaitb4.fnal.gov:2880${spath}/junk.data
Copying file:///home/litvinse/scitokens/junk [DONE] after 33s
[litvinse@fnisd1 scitokens]$ gfal-rm https://fndcaitb4.fnal.gov:2880${spath}/junk.data
https://fndcaitb4.fnal.gov:2880/mu2e/scratch/users/litvinse/junk.data FAILED
gfal-rm error: 5 (Input/output error) - DavPosix::unlink HTTP 500 : Unexpected server error: 500 with url https://fndcaitb4.fnal.gov:2880/mu2e/scratch/users/litvinse/junk.data
[litvinse@fnisd1 scitokens]$
There is NPE on sever side in both cases. File is ultimately removed but as you can see client sees a fail.
As stack trace indicates this seems to have to do with Kafka. So I did:
webdav.enable.kafka=false
and I see no NPE and delete succeeds.
@mksahakyan may be you can take a look?
@paulmillar and @mksahakyan I have looked at the code in
DoorRequestMessageSerializer.java
And I modified it like so:
JSONArray subject = new JSONArray();
// data.getSubject().getPrincipals().forEach(s -> subject.put(s));
data.getSubject().getPrincipals()
.stream()
.filter(s -> !(s instanceof JwtJtiPrincipal))
.forEach(subject::put);
so that I am filtering out JwtJtiPrincipal
.... And I get no NPE. All works as expected.
So, something in that Principal leads to NPE.
Principal: JwtJtiPrincipal[https://cilogon.org/oauth2/3cbad440f5235fc44c38e465d3f86c6e?type=accessToken&ts=1663358669990&version=v2.0&lifetime=10800000@mu2e]
As I mentioned in #6722, the line numbers in the stack-trace simply don't match what dCache code is doing: not in the 7.2
branch and not in master
branch. The line numbers match (more or less) how the method worked in a much older (and unsupported) version of dCache.
The only explanation that I can think of (with the evidence so far) is that you're not running the version of dCache you think you are.
There are (at least) two possible explanations:
- you have a rogue
dcache-core.jar
file installed in the/usr/share/dcache/classes
,/usr/share/dcache/plugins
or/usr/local/share/dcache/plugins
directory, which dCache is using instead of thedcache-core.jar
from the RPM. - you have installed a plugin that (inadvertently) contains classes (such as
DoorRequestMessageSerializer
) from some ancient version of dCache.
Paul,
I build RPM from master
mvn clean package -am -pl packages/fhs -P rpm
and I deployed it using
rpm -Uvh --force ...
I have been building and deploying dcache this way for many years - I do not remember anything along you suggested fault lines happening.
Please note, that the stack trace looks different from #6772
I just re-installed having removed everything in classes directory:
2405 2022-09-23 08:54:26 cd /usr/share/dcache/classes/
2406 2022-09-23 08:54:27 ls -al
2407 2022-09-23 08:54:35 dcache stop
2408 2022-09-23 08:55:27 rm -rf *
2409 2022-09-23 08:55:37 cd /opt/
2413 2022-09-23 08:56:35 rpm -Uvh -force dcache-8.2.0.7d79a22-1.noarch.rpm
I get the same stacktrace on remove:
23 Sep 2022 08:59:45 (WebDAV-fndcaitb4-1) [door:WebDAV-fndcaitb4-1@webdavDomain:AAXpWJcZTYg] Internal server error
java.lang.NullPointerException: null
at org.dcache.notification.DoorRequestMessageSerializer.serialize(DoorRequestMessageSerializer.java:77)
at org.dcache.notification.DoorRequestMessageSerializer.serialize(DoorRequestMessageSerializer.java:18)
at org.apache.kafka.common.serialization.Serializer.serialize(Serializer.java:62)
at org.apache.kafka.clients.producer.KafkaProducer.doSend(KafkaProducer.java:945)
at org.apache.kafka.clients.producer.KafkaProducer.send(KafkaProducer.java:905)
at org.springframework.kafka.core.DefaultKafkaProducerFactory$CloseSafeProducer.send(DefaultKafkaProducerFactory.java:993)
at org.springframework.kafka.core.KafkaTemplate.doSend(KafkaTemplate.java:649)
at org.springframework.kafka.core.KafkaTemplate.send(KafkaTemplate.java:403)
at org.springframework.kafka.core.KafkaTemplate.sendDefault(KafkaTemplate.java:382)
at org.dcache.webdav.DcacheResourceFactory.sendRemoveInfoToBilling(DcacheResourceFactory.java:1102)
at org.dcache.webdav.DcacheResourceFactory.deleteFile(DcacheResourceFactory.java:1087)
at org.dcache.webdav.DcacheFileResource.delete(DcacheFileResource.java:190)
at io.milton.http.DeleteHelperImpl.delete(DeleteHelperImpl.java:126)
at io.milton.http.http11.DeleteHandler.processExistingResource(DeleteHandler.java:98)
at io.milton.http.ResourceHandlerHelper.processResource(ResourceHandlerHelper.java:196)
at io.milton.http.ResourceHandlerHelper.processResource(ResourceHandlerHelper.java:131)
at io.milton.http.http11.DeleteHandler.processResource(DeleteHandler.java:83)
at io.milton.http.ResourceHandlerHelper.process(ResourceHandlerHelper.java:127)
at org.dcache.webdav.DcacheResourceHandlerHelper.process(DcacheResourceHandlerHelper.java:42)
at io.milton.http.http11.DeleteHandler.process(DeleteHandler.java:72)
at org.dcache.webdav.DcacheStandardFilter.process(DcacheStandardFilter.java:50)
at io.milton.http.FilterChain.process(FilterChain.java:46)
at org.dcache.webdav.transfer.CopyFilter.process(CopyFilter.java:276)
at io.milton.http.FilterChain.process(FilterChain.java:46)
at io.milton.http.HttpManager.process(HttpManager.java:158)
at org.dcache.webdav.MiltonHandler.handle(MiltonHandler.java:77)
at org.eclipse.jetty.server.handler.HandlerList.handle(HandlerList.java:59)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
at org.dcache.http.AuthenticationHandler.access$001(AuthenticationHandler.java:55)
at org.dcache.http.AuthenticationHandler.lambda$handle$0(AuthenticationHandler.java:156)
at java.base/java.security.AccessController.doPrivileged(Native Method)
at java.base/javax.security.auth.Subject.doAs(Subject.java:361)
at org.dcache.http.AuthenticationHandler.handle(AuthenticationHandler.java:153)
at org.eclipse.jetty.server.handler.HandlerList.handle(HandlerList.java:59)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
at org.dcache.http.AbstractLoggingHandler.handle(AbstractLoggingHandler.java:110)
at org.eclipse.jetty.server.handler.HandlerList.handle(HandlerList.java:59)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:322)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
at org.eclipse.jetty.server.Server.handle(Server.java:516)
at org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:388)
at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:633)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:380)
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:277)
at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:105)
at org.eclipse.jetty.io.ssl.SslConnection$DecryptedEndPoint.onFillable(SslConnection.java:555)
at org.eclipse.jetty.io.ssl.SslConnection.onFillable(SslConnection.java:410)
at org.eclipse.jetty.io.ssl.SslConnection$2.succeeded(SslConnection.java:164)
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:105)
at org.eclipse.jetty.io.ChannelEndPoint$1.run(ChannelEndPoint.java:104)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:338)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:315)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:173)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:131)
at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:386)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:883)
at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1034)
at java.base/java.lang.Thread.run(Thread.java:829)
Hi Dmitry,
Could you try downloading one of the 7.2 RPMs from the dcache.org site, install it, and see if you can reproduce this problem?
Cheers, Paul.
the error is coming from JSONObject class
public String toString() {
try {
return this.toString(0);
} catch (Exception var2) {
return null;
}
}
i am trying to debug is now to get the exception
The patch fixing the issue is on the RBT
@mksahakyan Where does this JSONObject
code come from?
JSONObject (org.json) is comming from line 77 in DoorRequestMessageSerializer
return o.toString().getBytes(UTF_8); then at some point in JSONObject on line 185 happens the following:
static final Writer writeValue(Writer writer, Object value, int indentFactor, int indent) throws JSONException, IOException { > if (value != null && !value.equals((Object)null)) {
and then is OpScopedPrincipal 's equal() is called.
It is worth noticing that, "org.json" version is old 2014
Sorry, I was a little unclear.
You posted a code snippet; specifically,
public String toString() {
try {
return this.toString(0);
} catch (Exception var2) {
return null;
}
}
It isn't dCache code, so from which jar file does this code come?
from the jar json-20141113.jar
<dependency>
<groupId>org.json</groupId>
<artifactId>json</artifactId>
<version>20141113</version>
</dependency>
https://github.com/stleary/JSON-java/blob/master/src/main/java/org/json/JSONObject.java#L2365
Ta:
https://github.com/stleary/JSON-java/pull/690
What I still don't understand is why the stack-trace contains this entry:
at org.dcache.notification.DoorRequestMessageSerializer.serialize(DoorRequestMessageSerializer.java:18)
The method serialize
isn't defined at line 18, it's the class definition.
Here is the link to RB fix from @mksahakyan
Patch: https://rb.dcache.org/r/13690/
Thank you @mksahakyan for fixing this issue