openwayback

NoClassDefFoundError raised for ZipNumClusterSearchResultSource; we are not using ZipNum

Open ptrourke opened this issue 4 years ago • 7 comments

We've been seeing a NoClassDefFoundError raised when an exception occurs on opening a CDX file. I suspect the CDX file error is related to https://github.com/iipc/openwayback/issues/276 (this instance has 75 unique CDX files, I think, and they are quite large; I'm hoping other work we have planned will resolve that underlying issue), but I'm asking specifically about the NoClassDefFoundError.

It references ZipNumClusterSearchResultSource. I know we're not using ZipNum, but I'm wondering whether we need a bean definition referencing ZipNumClusterSearchResultSource, indicating that we are not using it, to avoid seeing this NoClassDefFoundError?

java.io.FileNotFoundException: /wa_idx/production/contract_crawls/pre2009_priority_iraq.cdx (Too many open files)
at java.io.RandomAccessFile.open(Native Method)
at java.io.RandomAccessFile.<init>(RandomAccessFile.java:241)
at org.archive.wayback.util.flatfile.FlatFile.getRecordIterator(FlatFile.java:202)
at org.archive.wayback.resourceindex.cdx.CDXIndex.getPrefixIterator(CDXIndex.java:61)
at org.archive.wayback.resourceindex.CompositeSearchResultSource.getPrefixIterator(CompositeSearchResultSource.java:78)
at org.archive.wayback.resourceindex.LocalResourceIndex.doCaptureQuery(LocalResourceIndex.java:210)
at org.archive.wayback.resourceindex.LocalResourceIndex.query(LocalResourceIndex.java:326)
[snip stack trace]
at java.lang.Thread.run(Thread.java:745)
Jul 27, 2020 9:27:52 AM org.apache.catalina.core.StandardWrapperValve invoke
SEVERE: Servlet.service() for servlet default threw exception
java.lang.NoClassDefFoundError: org/archive/wayback/resourceindex/ZipNumClusterSearchResultSource
at org.archive.wayback.resourceindex.CompositeSearchResultSource.getPrefixIterator(CompositeSearchResultSource.java:81)
at org.archive.wayback.resourceindex.LocalResourceIndex.doCaptureQuery(LocalResourceIndex.java:210)
at org.archive.wayback.resourceindex.LocalResourceIndex.query(LocalResourceIndex.java:326)
at org.archive.wayback.webapp.AccessPoint.queryIndex(AccessPoint.java:598)
at org.archive.wayback.webapp.AccessPoint.searchCaptures(AccessPoint.java:1016)
at org.archive.wayback.webapp.AccessPoint.handleReplay(AccessPoint.java:772)
at org.archive.wayback.webapp.AccessPoint.handleRequest(AccessPoint.java:314)
at org.archive.wayback.util.webapp.RequestMapper.handleRequest(RequestMapper.java:198)
at org.archive.wayback.util.webapp.RequestFilter.doFilter(RequestFilter.java:146)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:103)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:861)
at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:606)
at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
at java.lang.Thread.run(Thread.java:745)

Any suggestions would be welcome! Thank you!

ptrourke avatar Aug 04 '20 21:08 ptrourke

Hey Patrick! Thanks for bringing up this error. ~~Could it be that CompositeSearchResultSource.java is missing an import? If you add to that file:~~

~~import org.archive.wayback.resourceindex.ZipNumClusterSearchResultSource;~~

~~Does it fix the issue?~~

Never mind, that class is in the same package, so it doesn't need an explicit import.

ldko avatar Aug 04 '20 21:08 ldko

Hmm. I'd better see if I can find the build for it, then, because I thought this was just the standard binary at http://search.maven.org/#search%7Cga%7C1%7Cg%3A%22org.netpreserve.openwayback%22. I'll look at the documentation here.

ptrourke avatar Aug 04 '20 22:08 ptrourke

Well, the source I found in a build environment does lack that import. I'll circle back! Thank you, that should have been an obvious thing for me to check.

ptrourke avatar Aug 04 '20 22:08 ptrourke

Ah, I see, no, it's not a custom build after all:

https://github.com/iipc/openwayback/blob/master/wayback-core/src/main/java/org/archive/wayback/resourceindex/CompositeSearchResultSource.java

Can you confirm that the import is missing from the java source file in the repo?

ptrourke avatar Aug 05 '20 12:08 ptrourke

Yeah, I can confirm the import isn't in the repo, and isn't needed. I was not able to trigger the java.lang.NoClassDefFoundError: org/archive/wayback/resourceindex/ZipNumClusterSearchResultSource. To try, I configured OpenWayback to use CompositeSearchResultSource with CDX files that don't exist, which raises the ResourceIndexNotAvailableException, and I added logging to verify that execution reached the catch (ResourceIndexNotAvailableException e) block.

I wonder if the "Too many open files" issue can cause a java.lang.NoClassDefFoundError?

ldko avatar Aug 05 '20 16:08 ldko

@ldko yes, that is possible - if the ClassLoader has to open a new file to find the class in question, it'll fail if there are too many file handles open already. IMO the next step is to raise the nofile ulimit (the open-files limit) and see if this error comes up again.

anjackson avatar Aug 05 '20 19:08 anjackson
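
To illustrate the explanation above: a JVM usually does not touch a class until code that uses it first runs, so a class referenced only on an error path may first be needed at exactly the moment file handles are exhausted. The sketch below is not OpenWayback code; `LazyLoadDemo` and `OnlyUsedOnError` are hypothetical names. It shows lazy class initialization, which is the easily observable part of that behavior, not the class-file read itself.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of OpenWayback.
final class LazyLoadDemo {
    // Records when the error-path class is first initialized.
    static final List<String> EVENTS = new ArrayList<>();

    // Stand-in for a class that is referenced only inside a catch block.
    static final class OnlyUsedOnError {
        static {
            EVENTS.add("OnlyUsedOnError initialized");
        }

        static String describe(Exception e) {
            return "handled: " + e.getMessage();
        }
    }

    // Simulates a query that only touches OnlyUsedOnError when the open fails.
    static String query(boolean fail) {
        try {
            if (fail) {
                throw new IOException("Too many open files");
            }
            return "ok";
        } catch (IOException e) {
            // First use of OnlyUsedOnError happens here, on the error path.
            return OnlyUsedOnError.describe(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(query(false)); // ok
        System.out.println(EVENTS);       // [] (error-path class not initialized yet)
        System.out.println(query(true));  // handled: Too many open files
        System.out.println(EVENTS);       // [OnlyUsedOnError initialized]
    }
}
```

If the real class is first needed the same way, that would also fit the reproduction attempt with missing CDX files not triggering the error: without file handle exhaustion, the class can still be read from disk when it is first used.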

We're trying with a 2^16 nofile ulimit to see if that helps. Thank you!

ptrourke avatar Aug 27 '20 17:08 ptrourke
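
When raising the limit, it can help to confirm that the new value is actually visible inside the JVM, since a limit set in a login shell does not always apply to a service such as Tomcat. A minimal sketch, assuming a HotSpot/OpenJDK runtime on a Unix-like system where `com.sun.management.UnixOperatingSystemMXBean` is available; `FdHeadroom` is a hypothetical helper name, not part of OpenWayback.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

// Hypothetical helper, not part of OpenWayback.
final class FdHeadroom {

    // Returns {open, max} file descriptor counts for this JVM process,
    // or null when the platform bean does not expose them (e.g. on Windows).
    static long[] snapshot() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            com.sun.management.UnixOperatingSystemMXBean unix =
                    (com.sun.management.UnixOperatingSystemMXBean) os;
            return new long[] {
                unix.getOpenFileDescriptorCount(),
                unix.getMaxFileDescriptorCount()
            };
        }
        return null;
    }

    public static void main(String[] args) {
        long[] fds = snapshot();
        if (fds == null) {
            System.out.println("File descriptor counts are not available on this JVM");
        } else {
            System.out.println("open=" + fds[0] + " max=" + fds[1]);
        }
    }
}
```

With a 2^16 limit in effect, the reported maximum would be expected to read 65536. Logging the open count periodically from inside the webapp (rather than from a separate process, which would report its own descriptors) would show how close the instance gets to the limit under load.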