datashare
upgrade tika to the 2.4.1 release
Is your feature request related to a problem? Please describe.
No.
Describe the solution you'd like
Datashare/extract depends on Tika 1.22 (released 1 August 2019). Since then there have been 4 releases; the latest is 1.26, and there is a 2.0.0-alpha.
For now, the upgrade breaks the indexing features with a java.lang.NoSuchMethodError (see ICIJ/extract@55ff0cc7bc5acc570839849e149cc039fb5c45ad).
We need to check all of Tika's dependencies that are specified in the pom.xml (along with datashare's transitive dependencies).
The root cause of the NoSuchMethodError seemed to be commons-codec, which needed to be upgraded from 1.10 to 1.13. But after doing that we still saw the error.
This may be related to https://issues.liferay.com/browse/LPS-120596
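One way to narrow down a NoSuchMethodError like this is to check which jar a class is actually loaded from at runtime, since an older copy pulled in transitively can shadow the upgraded one. The following is only a minimal sketch: the `ClassOrigin` class and its `originOf` helper are hypothetical names, not part of datashare or extract, and the default class name assumes commons-codec is the library under suspicion.

```java
import java.security.CodeSource;

// Hypothetical diagnostic helper, not part of datashare/extract.
class ClassOrigin {

    /** Returns the location a class was loaded from, or a short note if it cannot be determined. */
    static String originOf(String className) {
        try {
            // Load without initializing the class; we only need its metadata.
            Class<?> cls = Class.forName(className, false, ClassOrigin.class.getClassLoader());
            CodeSource source = cls.getProtectionDomain().getCodeSource();
            if (source == null || source.getLocation() == null) {
                // JDK classes loaded by the bootstrap loader have no code source.
                return "bootstrap/JDK (no code source)";
            }
            return source.getLocation().toString();
        } catch (ClassNotFoundException | LinkageError e) {
            return "not on classpath";
        }
    }

    public static void main(String[] args) {
        // Assumption: commons-codec is the suspected library; pass another class name to override.
        String name = args.length > 0 ? args[0] : "org.apache.commons.codec.binary.Base64";
        System.out.println(name + " -> " + originOf(name));
    }
}
```

Run with the same classpath as the indexing code: if the printed jar is still a 1.10 artifact, another dependency is probably bringing in the old version.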
Upgrading Tika early and often is a good idea. Let me know if you want to chat about migrating to >= 2.1.0.
@tballison thanks for your message. I'm digging into it. Which do you think is best:
- a progressive upgrade (1.24, 1.26, 2.0, ...)?
- going straight to 2.1 and solving problems one by one?
- another strategy?
If you have time, I'd recommend going straight to 2.4.1. There aren't that many diffs/changes within 2.x. This is the documentation we've put together: https://cwiki.apache.org/confluence/display/TIKA/Migrating+to+Tika+2.0.0
The 2.5.0 release should happen in the next few weeks, but that should be a drop-in replacement for 2.4.1.
Let me know if you have any questions on 2.x!
This issue is stale because it has been open for 40 days with no activity.