log4j-detector
java.lang.IllegalArgumentException: malformed input off : 4, length : 1 at java.base/java.lang.StringCoding.throwMalformed(StringCoding.java:698)
We ran the scanner on a multi-archive TAR file that contained a few .jar files and got the following message:
-- Problem: XX/log4jtest.tar - java.lang.IllegalArgumentException: malformed input off : 4, length : 1
java.lang.IllegalArgumentException: malformed input off : 4, length : 1
at java.base/java.lang.StringCoding.throwMalformed(StringCoding.java:698)
at java.base/java.lang.StringCoding.decodeUTF8_0(StringCoding.java:885)
at java.base/java.lang.StringCoding.newStringUTF8NoRepl(StringCoding.java:978)
at java.base/java.lang.System$2.newStringUTF8NoRepl(System.java:2270)
at java.base/java.util.zip.ZipCoder$UTF8.toString(ZipCoder.java:60)
at java.base/java.util.zip.ZipCoder.toString(ZipCoder.java:87)
at java.base/java.util.zip.ZipInputStream.readLOC(ZipInputStream.java:302)
at java.base/java.util.zip.ZipInputStream.getNextEntry(ZipInputStream.java:124)
at com.mergebase.log4j.Log4JDetector.findLog4jRecursive(Log4JDetector.java:208)
at com.mergebase.log4j.Log4JDetector.scan(Log4JDetector.java:442)
at com.mergebase.log4j.Log4JDetector.analyze(Log4JDetector.java:502)
at com.mergebase.log4j.Log4JDetector.analyze(Log4JDetector.java:497)
at com.mergebase.log4j.Log4JDetector.main(Log4JDetector.java:84)
The TAR file itself seems to be correct.
Is it possible that there are problems with "multi-archive" archives that contain sub-archives which are perhaps not UTF-8 encoded?
Just a shot in the dark ...
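One way to probe the non-UTF-8 hypothesis: the stack trace shows the failure inside ZipCoder$UTF8.toString, called from ZipInputStream.readLOC, which is where the entry name gets decoded. The sketch below is not the detector's code; the class and method names (LenientZipLister, listEntries, listEntriesLenient) are made up for illustration. It reads an archive with the default UTF-8 decoding first and, if that fails, retries with ISO-8859-1, a single-byte charset that can decode any byte sequence. The choice of ISO-8859-1 is an assumption; the real names may have been written in another legacy encoding.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipException;
import java.util.zip.ZipInputStream;

// Hypothetical helper, not part of log4j-detector.
class LenientZipLister {

    /** Lists all entry names, decoding them with the given charset. */
    static List<String> listEntries(byte[] zipBytes, Charset charset) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zin =
                 new ZipInputStream(new ByteArrayInputStream(zipBytes), charset)) {
            ZipEntry entry;
            while ((entry = zin.getNextEntry()) != null) {
                names.add(entry.getName());
            }
        }
        return names;
    }

    /** Tries UTF-8 first; on a name-decoding failure, retries with ISO-8859-1. */
    static List<String> listEntriesLenient(byte[] zipBytes) throws IOException {
        try {
            return listEntries(zipBytes, StandardCharsets.UTF_8);
        } catch (IllegalArgumentException | ZipException e) {
            // Assumption: the failure came from a non-UTF-8 entry name.
            // ISO-8859-1 maps every byte to a character, so decoding cannot fail.
            return listEntries(zipBytes, StandardCharsets.ISO_8859_1);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] data = Files.readAllBytes(Paths.get(args[0]));
        for (String name : listEntriesLenient(data)) {
            System.out.println(name);
        }
    }
}
```

If the retry succeeds where UTF-8 failed, that points to entry names in a legacy encoding. If both attempts fail, the input is more likely not a usable ZIP stream at all.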
It also spits out a couple of java.lang.IllegalArgumentException: MALFORMED and java.io.EOFException errors at me: the former for some libjli.so and libzip.so modules, and the latter for jexec in an old JRE directory, probably as reported by @volker-graf.
The latest version probably won't have these errors because it now ignores everything that isn't a zip/ear/jar/war/aar file (with those suffixes). Would that work for you? Or do you think the log4j-detector should enter *.tar files?
(Entering *.tar.gz / *.tar.xz / *.tar.bz2 starts to be a pain since those require temporary disk space, whereas the current approach, which only enters zip files, can do everything in-memory.)
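To illustrate why nested zip-style archives can be handled without temporary files, here is a minimal sketch of the general technique. It is not the detector's actual implementation, and NestedZipWalker, looksLikeZip, and walk are hypothetical names. A ZipInputStream positioned on an entry is itself an InputStream over that entry's data, so a second ZipInputStream can be layered directly on top of it. The suffix list is taken from the comment above.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical sketch, not part of log4j-detector.
class NestedZipWalker {

    /** Suffix check only; the content is not inspected. */
    static boolean looksLikeZip(String name) {
        String n = name.toLowerCase(Locale.ROOT);
        return n.endsWith(".zip") || n.endsWith(".jar") || n.endsWith(".war")
            || n.endsWith(".ear") || n.endsWith(".aar");
    }

    /** Collects the paths of all non-archive entries, descending into nested archives. */
    static void walk(String prefix, InputStream in, List<String> found) throws IOException {
        // Deliberately not closed: closing would also close the enclosing stream.
        ZipInputStream zin = new ZipInputStream(in);
        ZipEntry entry;
        while ((entry = zin.getNextEntry()) != null) {
            if (entry.isDirectory()) {
                continue;
            }
            String path = prefix + "!/" + entry.getName();
            if (looksLikeZip(entry.getName())) {
                // zin now streams this entry's bytes, so it can feed a nested reader.
                walk(path, zin, found);
            } else {
                found.add(path);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        List<String> found = new ArrayList<>();
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            walk(args[0], in, found);
        }
        found.forEach(System.out::println);
    }
}
```

The same layering does not carry over to *.tar.gz and similar formats with the JDK alone, since java.util.zip has no TAR reader.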
Dear Julius, I have tried it again with version 2021-12-16 and it indeed skips tarballs. I only got an Out Of Memory error now after some time, probably because multipart ZIP files and self-extracting shell ZIP files cannot be detected / analyzed successfully. But the static object modules and the jexec are not reported any more. Thanks for that, it works for me. Dunno about @volker-graf, who is the original poster. Kind regards, Stefan
I got a few "Out Of Memory" errors, but I fixed them by adding -Xmx8G to the command-line arguments.