Tika 2.x incompatible with Quarkus Tika
Camel has moved on to Tika 2.x, but Quarkus Tika is currently aligned with 1.x.
Most functionality seems to work when I exclude the org.apache.tika dependencies from org.apache.camel:camel-tika, except for the textMain content type, which fails with:
Caused by: java.lang.NoClassDefFoundError: org/apache/tika/sax/boilerpipe/BoilerpipeContentHandler
at org.apache.camel.component.tika.TikaProducer.getContentHandler(TikaProducer.java:154)
at org.apache.camel.component.tika.TikaProducer.doParse(TikaProducer.java:117)
at org.apache.camel.component.tika.TikaProducer.process(TikaProducer.java:91)
at org.apache.camel.support.AsyncProcessorConverterHelper$ProcessorToAsyncProcessorBridge.process(AsyncProcessorConverterHelper.java:66)
... 40 more
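A quick way to check whether the class named in the trace is visible at runtime is a small classpath probe. This is only a diagnostic sketch: the `TikaClasspathProbe` class and its `isPresent` helper are hypothetical names, not part of Camel, Quarkus, or Tika. The only name taken from the trace is the class `org.apache.tika.sax.boilerpipe.BoilerpipeContentHandler`.

```java
// Hypothetical diagnostic helper, not part of Camel Quarkus or Quarkus Tika.
class TikaClasspathProbe {

    // Class name copied from the NoClassDefFoundError in the stack trace above.
    static final String BOILERPIPE_HANDLER =
            "org.apache.tika.sax.boilerpipe.BoilerpipeContentHandler";

    // Returns true if the named class can be located without initializing it.
    static boolean isPresent(String className) {
        try {
            Class.forName(className, false, TikaClasspathProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Missing class, or a class whose own dependencies are missing.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(BOILERPIPE_HANDLER + " present: " + isPresent(BOILERPIPE_HANDLER));
    }
}
```

Run on the application's classpath, a `false` result for that class would be consistent with the error above, that is, the Tika version on the classpath does not provide the class that camel-tika expects.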
In native mode this can be worked around via a substitution. In JVM mode it's a bit trickier to fix.
I have a workaround in place until things are aligned better.
See also https://github.com/quarkiverse/quarkus-tika/issues/22.
TikaProducerSubstitutions.java is missing a license header.
@jamesnetherton should it be closed?
I was leaving it open because the proper fix is to upgrade to a Quarkus Tika release that is aligned with Tika 2.x.
OK, I get it.
@jamesnetherton quarkus-tika has been upgraded to 2.0.0.CR1, which now depends on Tika 2.7.0. So I think we could revert this workaround and close this issue?
This was fixed in 3.0.0-M2 by https://github.com/apache/camel-quarkus/pull/4739.