phase4 icon indicating copy to clipboard operation
phase4 copied to clipboard

Problem with OCSP certificate revocation check in Java 17u4 and later

Open Florianisme opened this issue 2 years ago • 49 comments

Hi Philip, I'm currently encountering an issue while testing with the new 2.1.0 release. I'm using Spring Boot 3.

When I start the Access Point and send a message to customer A, the transmission is successful. While the application is still running, I send a message to customer B. The transmission fails with the following error:

com.helger.phase4.peppol.Phase4PeppolException: The configured receiver AP certificate is not valid (at 2023-05-05T09:28:22.546313412+02:00) and cannot be used for sending. Aborting. Reason: certificate is revoked at com.helger.phase4.peppol.Phase4PeppolSender._checkReceiverAPCert(Phase4PeppolSender.java:280 undefined) ~[phase4-peppol-client-2.1.0.jar!/:2.1.0] at com.helger.phase4.peppol.Phase4PeppolSender$AbstractPeppolUserMessageBuilder.finishFields(Phase4PeppolSender.java:652 undefined) ~[phase4-peppol-client-2.1.0.jar!/:2.1.0] at com.helger.phase4.peppol.Phase4PeppolSender$Builder.finishFields(Phase4PeppolSender.java:1061 undefined) ~[phase4-peppol-client-2.1.0.jar!/:2.1.0] at com.helger.phase4.sender.AbstractAS4MessageBuilder.sendMessage(AbstractAS4MessageBuilder.java:605 undefined) ~[phase4-lib-2.1.0.jar!/:2.1.0]

If however I restart the application and send a message to customer B first, it succeeds. When I try to send a message to customer A then, it fails with the same exception.

The transmission keeps failing until I restart the application. Let me know if you need additional logs. I skimmed through the code of phase4 and peppol-commons but did not find any obvious error yet.

Florianisme avatar May 05 '23 07:05 Florianisme

@Florianisme this is indeed weird. Can you please turn on debug logging and post what happens before? Or do you see anything weird regarding to certificate validation on the first message sending?

phax avatar May 05 '23 10:05 phax

I'm running with Debug enabled all the time in our test instance. I have attached a log from the complete SMP lokup. AP_certificate_revoked.txt Note that I'm running in Test mode, so it's using the SMK.

I sadly can't see anything wrong with the initial validation. I looks like it could be an issue witht the revocation cache, did anything significantly change since the last 1.4.x release?

Florianisme avatar May 05 '23 12:05 Florianisme

I see, I need to improve debug logging when it comes to certificate checking. There were no significant changes in my code from 1.4.x to 2.x, but due to the Java version change from 1.8 to 11 there might have been a change there.

Anyhow, this seems to be the debug log of the second request that fails - most likely because the cached result is returned.

Can you please provide a similar debug log for the first transmission request? Thanks

phax avatar May 05 '23 12:05 phax

Existing debug log messages should already be emitted. Search e.g. for Performing certificate revocation check on certificate

phax avatar May 05 '23 12:05 phax

Will add a log, which contains the initial fail

I does print that in the successful log AP_certificate_ok.txt. See that I'm querying the same Peppol Receiver ID.. :)

I actually had to switch to Java 17 because Spring 3.0 requires it. I'm also running behind a proxy if that is of any interest. Though I have set the properties http.proxyHost, http.proxy.host (for ports and https respectively)

Florianisme avatar May 05 '23 12:05 Florianisme

Yes, that looks pretty okay.

As a workaround please call PeppolCertificateChecker.setCacheOCSPResults(false) to disable OCSP result caching in general

phax avatar May 05 '23 12:05 phax

That does seem to work for now. Though I could not reproduce the strange behavior I described initally before I made the change. I'll do some further testing next week. Thanks!

Florianisme avatar May 05 '23 13:05 Florianisme

I have enabled the java.security.debug=certpath and javax.net.debug=all properties and this is what I found in the log: certpath: connecting to OCSP service at: http://pki-ocsp.symauth.com certpath: RevocationChecker.check() Unable to determine revocation status due to network error

So I'm guessing nothing you can do :) Sorry to have bothered you!

Florianisme avatar May 08 '23 18:05 Florianisme

Thanks for digging deep into this issue. So it really seems to be an issue with the outbound proxy on your side....

https://bugs.openjdk.org/browse/JDK-8132011 may be a good hint to see what is necessary to enable a Proxy in OCSP - it seems like this can only be achieved via the global system properties.

phax avatar May 08 '23 18:05 phax

I am doing that already, though I might have to dig even deeper now. Did your library use CRL instead of OCSP before by any means?

I don't get why it was working perfectly fine before..

Florianisme avatar May 08 '23 18:05 Florianisme

I am currently only using OCSP and not CRL. Maybe some inhouse changes???

phax avatar May 08 '23 18:05 phax

I'm guessing it's related to our network. Some OCSP requests succeed so my implementaion can't be that wrong.. :)

Florianisme avatar May 08 '23 19:05 Florianisme

I can confirm, that the OSCP lookup has some issues when using with Java 17 - it really seems to be quite indeterministic when it works and when it doesn't work.

phax avatar May 16 '23 15:05 phax

That's interesting. I was fully blaming our infrastructure for the intermittent issues. I dug through the Java implementation of the OCSP check but could not find a reason why it would fail. I sadly also didn't find anything while googling yet.

The thing I noticed was that when it fails, it fails quickly (like tens of milliseconds quickly).

Florianisme avatar May 16 '23 15:05 Florianisme

I enabled the system property -Djava.security.debug=certpath and in case of error, this is the debug trace I see:

certpath: ---checking signature...
certpath: signature verified.
certpath: BasicChecker.updateState issuer: CN=PEPPOL ACCESS POINT TEST CA - G2, OU=FOR TEST ONLY, O=OpenPEPPOL AISBL, C=BE; subject: CN=PJP000572, OU=PEPPOL TEST AP, O="Mirai Communication Network, Inc.", C=JP; serial#: 86370016705632901727484386645577705200
certpath: RevocationChecker.check: checking cert
  SN:     40fa4298 ff6ca832 2c974987 bdb626f0
  Subject: CN=PJP000572, OU=PEPPOL TEST AP, O="Mirai Communication Network, Inc.", C=JP
  Issuer: CN=PEPPOL ACCESS POINT TEST CA - G2, OU=FOR TEST ONLY, O=OpenPEPPOL AISBL, C=BE
certpath: connecting to OCSP service at: http://pki-ocsp.symauth.com
certpath: RevocationChecker.check() Unable to determine revocation status due to network error
certpath: RevocationChecker.check() preparing to failover

A similar positive case looks like this:

certpath: -Using checker7 ... [sun.security.provider.certpath.RevocationChecker]
certpath: RevocationChecker.check: checking cert
  SN:     39f4fc3d 66ab97be 24e10f47 520ee7e7
  Subject: CN=POP000270, OU=PEPPOL PRODUCTION AP, O=BRZ GmbH, C=AT
  Issuer: CN=PEPPOL ACCESS POINT CA - G2, O=OpenPEPPOL AISBL, C=BE
certpath: connecting to OCSP service at: http://pki-ocsp.symauth.com
certpath: OCSP response status: SUCCESSFUL
certpath: OCSP response type: basic
certpath: Responder ID: byKey: 7283244128DE19362F3F88EBD4F0EAAEB366A0AF
certpath: OCSP response produced at: Tue May 16 12:30:17 UTC 2023
certpath: OCSP number of SingleResponses: 1
certpath: thisUpdate: Tue May 16 12:30:17 UTC 2023
certpath: nextUpdate: Tue May 23 12:30:17 UTC 2023
certpath: OCSP response cert #1: CN=PEPPOL ACCESS POINT CA - G2 OCSP Responder 2023-03-06T17:41:29Z, O=OpenPEPPOL AISBL, C=BE

on my local machine (Java 17.0.4.101-hotspot Win64) I can boil the error down to this exception:

certpath: connecting to OCSP service at: http://pki-ocsp.symauth.com
java.security.cert.CertPathValidatorException: Unable to determine revocation status due to network error
	at java.base/sun.security.provider.certpath.OCSP.check(OCSP.java:202)
	at java.base/sun.security.provider.certpath.RevocationChecker.checkOCSP(RevocationChecker.java:785)
	at java.base/sun.security.provider.certpath.RevocationChecker.check(RevocationChecker.java:369)
	at java.base/sun.security.provider.certpath.RevocationChecker.check(RevocationChecker.java:343)
	at java.base/sun.security.provider.certpath.SunCertPathBuilder.depthFirstSearchForward(SunCertPathBuilder.java:426)
	at java.base/sun.security.provider.certpath.SunCertPathBuilder.depthFirstSearchForward(SunCertPathBuilder.java:528)
	at java.base/sun.security.provider.certpath.SunCertPathBuilder.buildForward(SunCertPathBuilder.java:225)
	at java.base/sun.security.provider.certpath.SunCertPathBuilder.buildCertPath(SunCertPathBuilder.java:160)
	at java.base/sun.security.provider.certpath.SunCertPathBuilder.build(SunCertPathBuilder.java:131)
	at java.base/sun.security.provider.certpath.SunCertPathBuilder.engineBuild(SunCertPathBuilder.java:126)
	at java.base/java.security.cert.CertPathBuilder.build(CertPathBuilder.java:297)
	at com.helger.peppol.utils.CertificateRevocationChecker$AbstractRevocationCheckBuilder.lambda$build$1(CertificateRevocationChecker.java:659)
	at com.helger.peppol.utils.CertificateRevocationChecker$AbstractRevocationCheckBuilder.build(CertificateRevocationChecker.java:683)
	at com.helger.peppol.utils.PeppolCertificateChecker.checkCertificate(PeppolCertificateChecker.java:443)
	at com.helger.peppol.utils.PeppolCertificateChecker.checkPeppolAPCertificate(PeppolCertificateChecker.java:474)
	at com.helger.peppol.pub.PagePublicToolsParticipantInformation._queryParticipant(PagePublicToolsParticipantInformation.java:1003)
	at com.helger.peppol.pub.PagePublicToolsParticipantInformation.fillContent(PagePublicToolsParticipantInformation.java:1280)
	at com.helger.peppol.pub.PagePublicToolsParticipantInformation.fillContent(PagePublicToolsParticipantInformation.java:1)
	at com.helger.photon.uicore.page.AbstractWebPage.getContent(AbstractWebPage.java:162)
	at com.helger.photon.bootstrap4.uictrls.ext.BootstrapPageRenderer.getPageContent(BootstrapPageRenderer.java:133)
	at com.helger.photon.bootstrap4.uictrls.ext.BootstrapPageRenderer.getPageContent(BootstrapPageRenderer.java:160)
	at com.helger.peppol.pub.LayoutAreaContentProviderPublic.getPageContent(LayoutAreaContentProviderPublic.java:236)
	at com.helger.peppol.pub.LayoutAreaContentProviderPublic.getContent(LayoutAreaContentProviderPublic.java:299)
	at com.helger.peppol.ui.AppLayoutHTMLProvider.fillBody(AppLayoutHTMLProvider.java:68)
	at com.helger.photon.core.html.AbstractSWECHTMLProvider.fillHeadAndBody(AbstractSWECHTMLProvider.java:106)
	at com.helger.photon.core.html.AbstractHTMLProvider.createHTML(AbstractHTMLProvider.java:164)
	at com.helger.photon.app.html.PhotonHTMLHelper.createHTMLResponse(PhotonHTMLHelper.java:117)
	at com.helger.photon.core.servlet.AbstractApplicationXServletHandler.handleRequest(AbstractApplicationXServletHandler.java:102)
	at com.helger.xservlet.handler.simple.XServletHandlerToSimpleHandler.onRequest(XServletHandlerToSimpleHandler.java:238)
	at com.helger.xservlet.AbstractXServlet._invokeHandler(AbstractXServlet.java:354)
	at com.helger.xservlet.AbstractXServlet.service(AbstractXServlet.java:539)
	at jakarta.servlet.http.HttpServlet.service(HttpServlet.java:587)
	at com.helger.xservlet.AbstractXServlet.service(AbstractXServlet.java:595)
	at org.eclipse.jetty.servlet.ServletHolder$NotAsync.service(ServletHolder.java:1410)
	at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:764)
	at org.eclipse.jetty.servlet.ServletHandler$ChainEnd.doFilter(ServletHandler.java:1665)
	at com.helger.servlet.filter.CharacterEncodingFilter.doHttpFilter(CharacterEncodingFilter.java:184)
	at com.helger.servlet.filter.AbstractHttpServletFilter.doFilter(AbstractHttpServletFilter.java:66)
	at org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:202)
	at org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1635)
	at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:527)
	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:131)
	at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:578)
	at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:122)
	at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:223)
	at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1570)
	at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:221)
	at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1380)
	at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:176)
	at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:484)
	at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1543)
	at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:174)
	at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1302)
	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:129)
	at org.eclipse.jetty.server.handler.HandlerList.handle(HandlerList.java:51)
	at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:122)
	at org.eclipse.jetty.server.Server.handle(Server.java:563)
	at org.eclipse.jetty.server.HttpChannel.lambda$handle$0(HttpChannel.java:505)
	at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:762)
	at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:497)
	at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:282)
	at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:314)
	at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:100)
	at org.eclipse.jetty.io.SelectableChannelEndPoint$1.run(SelectableChannelEndPoint.java:53)
	at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.runTask(AdaptiveExecutionStrategy.java:416)
	at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.consumeTask(AdaptiveExecutionStrategy.java:385)
	at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.tryProduce(AdaptiveExecutionStrategy.java:272)
	at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.lambda$new$0(AdaptiveExecutionStrategy.java:140)
	at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:411)
	at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:934)
	at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1078)
	at java.base/java.lang.Thread.run(Thread.java:833)
Caused by: java.io.EOFException
	at java.base/sun.security.util.IOUtils.readExactlyNBytes(IOUtils.java:61)
	at java.base/sun.security.provider.certpath.OCSP.getOCSPBytes(OCSP.java:282)
	at java.base/sun.security.provider.certpath.OCSP.check(OCSP.java:195)
	... 71 more

So eventually we do need to take into consideration, that the OSCP Service at http://pki-ocsp.symauth.com has issues????

phax avatar May 16 '23 19:05 phax

Ha, here we have the source of the error: the expected length of 0x7fffffff is a bit too much: grafik The actual data read has 2277 bytes.

The expected length is the "content length" which is missing and therefore set to Integer.MAX_VALUE.

grafik

Indeed, no Content-Length header is returned:

grafik

phax avatar May 16 '23 20:05 phax

Comparing it to a successful response, the Content-Length header is contained and has the value of 2277:

grafik

So it is indeed an error at the OSCP service of DigiCert, that they are not consistently returning the Content-Length header would now be my conclusion.

phax avatar May 16 '23 20:05 phax

The reading code in Java 11.0.16 looks like this:

grafik

It deals with the "no Content-Length" situation totally different. I think I file a JDK bug

phax avatar May 16 '23 20:05 phax

The change was introduced from 17b3 to 17b4:

  • Version 17b3:

https://github.com/openjdk/jdk/blob/jdk-17%2B3/src/java.base/share/classes/sun/security/provider/certpath/OCSP.java#L262-L274

  • Version 17b4:

https://github.com/openjdk/jdk/blob/jdk-17%2B4/src/java.base/share/classes/sun/security/provider/certpath/OCSP.java#L271-L277

phax avatar May 16 '23 20:05 phax

The commit that changed it was https://github.com/openjdk/jdk/commit/f5ee356540d7aa4a7663c0d5d74f5fdb0726b426 as a result for https://bugs.openjdk.org/browse/JDK-8179503

phax avatar May 16 '23 20:05 phax

I filed a bug in the Java bug database. Internal ID 9075275 - under review.

phax avatar May 16 '23 20:05 phax

Wow, nice investigation! I don't know why but my stacktraces were cut off before the interesting stuff.. Let's see what will come out of this.

Florianisme avatar May 16 '23 21:05 Florianisme

(I created the Stack traces from within the debugger with "evaluate expression")

phax avatar May 16 '23 21:05 phax

The server behaves correctly IMHO, as there is either a Content-Length or a Transfer-Encoding header (presumeably "chunked"), so I guess we have to wait for a JDK update here. Thanks for the analysis, followed with interest :)

nbredenbals avatar May 17 '23 05:05 nbredenbals

The bug has been accepted and you can follow under https://bugs.java.com/bugdatabase/view_bug?bug_id=JDK-8308255

phax avatar May 17 '23 06:05 phax

PR for the fix is already opened https://github.com/openjdk/jdk17u-dev/pull/1361

Florianisme avatar May 22 '23 12:05 Florianisme

The fix is supposed to be in Java 17.0.8 which is scheduled to be released 2023-07-18 according to https://www.java.com/releases/

phax avatar Jun 18 '23 12:06 phax

Do you know when the Docker images will be updated? I did not find a schedule for any of the more popular JRE images

Florianisme avatar Jun 18 '23 13:06 Florianisme

No idea - sorry. I think that is vendor dependent but will happen asap (I hope ;-) )

phax avatar Jun 18 '23 13:06 phax

So the Adoptium update 17.0.8 is out. I am closing this issue now.

phax avatar Aug 01 '23 10:08 phax