log4j-detector

java.lang.IllegalArgumentException: malformed input off : 4, length : 1 at java.base/java.lang.StringCoding.throwMalformed(StringCoding.java:698)

Open volker-graf opened this issue 3 years ago • 4 comments

We tried the scanner on a multi-archive tar file that contained a few .jar files and got the following message:

-- Problem: XX/log4jtest.tar - java.lang.IllegalArgumentException: malformed input off : 4, length : 1
java.lang.IllegalArgumentException: malformed input off : 4, length : 1
        at java.base/java.lang.StringCoding.throwMalformed(StringCoding.java:698)
        at java.base/java.lang.StringCoding.decodeUTF8_0(StringCoding.java:885)
        at java.base/java.lang.StringCoding.newStringUTF8NoRepl(StringCoding.java:978)
        at java.base/java.lang.System$2.newStringUTF8NoRepl(System.java:2270)
        at java.base/java.util.zip.ZipCoder$UTF8.toString(ZipCoder.java:60)
        at java.base/java.util.zip.ZipCoder.toString(ZipCoder.java:87)
        at java.base/java.util.zip.ZipInputStream.readLOC(ZipInputStream.java:302)
        at java.base/java.util.zip.ZipInputStream.getNextEntry(ZipInputStream.java:124)
        at com.mergebase.log4j.Log4JDetector.findLog4jRecursive(Log4JDetector.java:208)
        at com.mergebase.log4j.Log4JDetector.scan(Log4JDetector.java:442)
        at com.mergebase.log4j.Log4JDetector.analyze(Log4JDetector.java:502)
        at com.mergebase.log4j.Log4JDetector.analyze(Log4JDetector.java:497)
        at com.mergebase.log4j.Log4JDetector.main(Log4JDetector.java:84)

The tar file itself seems to be correct.

Is it possible that there are problems with "multi-archive" archives that contain sub-archives which are perhaps not UTF-8 encoded?

Just a Shot in the Dark ...
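For what it's worth, the trace ends in ZipInputStream.readLOC, which decodes entry names with the stream's charset (UTF-8 unless another one is passed to the constructor). Below is a minimal sketch, not the detector's code, of how a non-UTF-8 entry name can trigger this kind of IllegalArgumentException. The class and method names (ZipNameCharsetDemo, zipWithLatin1Name, firstEntryName) are made up for illustration, and ISO-8859-1 is only an assumed example encoding.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical demo class, not part of log4j-detector.
final class ZipNameCharsetDemo {

    // Builds a tiny zip in memory whose single entry name is written as ISO-8859-1 bytes.
    static byte[] zipWithLatin1Name() {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zout = new ZipOutputStream(bytes, StandardCharsets.ISO_8859_1)) {
            // The umlaut is stored as the single byte 0xE4, which is not valid UTF-8 here.
            zout.putNextEntry(new ZipEntry("k\u00e4se.txt"));
            zout.write('x');
            zout.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Returns the first entry name, or null if the archive has no entry
    // or the name cannot be decoded with the given charset.
    static String firstEntryName(byte[] zip, Charset charset) {
        try (ZipInputStream zin = new ZipInputStream(new ByteArrayInputStream(zip), charset)) {
            ZipEntry entry = zin.getNextEntry();
            return entry == null ? null : entry.getName();
        } catch (IllegalArgumentException e) {
            // Malformed entry name for this charset (the case from the stack trace).
            return null;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] zip = zipWithLatin1Name();
        System.out.println("UTF-8:      " + firstEntryName(zip, StandardCharsets.UTF_8));
        System.out.println("ISO-8859-1: " + firstEntryName(zip, StandardCharsets.ISO_8859_1));
    }
}
```

If this is what happens in our case, catching the exception per archive (or retrying with another charset) would at least let the scan continue, but that is a guess on my side.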

volker-graf avatar Dec 14 '21 14:12 volker-graf

It also spits out a couple of java.lang.IllegalArgumentException: MALFORMED and java.io.EOFException errors at me for some libjli.so and libzip.so modules, and the latter also for jexec in an old JRE directory. This is probably the same problem as reported by @volker-graf.

stefan123t avatar Dec 14 '21 14:12 stefan123t

Latest version probably won't have these errors because it now ignores everything that isn't a zip/ear/jar/war/aar file (with those suffixes). Would that work for you? Or do you think the log4j-detector should enter *.tar files?
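Roughly, the suffix check could look like the sketch below. This is an illustration only, not the detector's actual code: the class name ArchiveSuffixFilter and the method shouldEnter are hypothetical, and matching case-insensitively is an assumption.

```java
import java.util.Locale;

// Hypothetical illustration of a suffix-based filter; not the detector's real code.
final class ArchiveSuffixFilter {

    // The suffixes named above.
    private static final String[] SUFFIXES = {".zip", ".ear", ".jar", ".war", ".aar"};

    // Returns true if the file name ends with one of the suffixes.
    // Case-insensitive matching is an assumption of this sketch.
    static boolean shouldEnter(String fileName) {
        String lower = fileName.toLowerCase(Locale.ROOT);
        for (String suffix : SUFFIXES) {
            if (lower.endsWith(suffix)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        for (String name : new String[] {"app.jar", "log4jtest.tar", "libzip.so", "jexec"}) {
            System.out.println(name + " -> " + shouldEnter(name));
        }
    }
}
```

With a filter like this, files such as libzip.so, jexec and log4jtest.tar are never opened as zip streams, so they cannot produce the MALFORMED or EOFException errors.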

(Entering *.tar.gz / *.tar.xz / *.tar.bz2 starts to be a pain since those require temporary disk space, whereas current approach that only enters zip files can do everything in-memory).
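To illustrate the in-memory point: nested zips can be walked by wrapping the current entry's stream in another ZipInputStream, so nothing has to be unpacked to disk. The following is a simplified sketch under that assumption, not the detector's actual implementation. All names (InMemoryZipWalk, walk, list, zipOf, looksLikeZip) are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical sketch of recursing into nested zips without temporary files.
final class InMemoryZipWalk {

    static boolean looksLikeZip(String name) {
        String lower = name.toLowerCase(Locale.ROOT);
        return lower.endsWith(".zip") || lower.endsWith(".jar") || lower.endsWith(".war")
                || lower.endsWith(".ear") || lower.endsWith(".aar");
    }

    // Walks a zip stream; nested archives are read straight from the parent's entry stream.
    static void walk(InputStream in, String prefix, List<String> found) throws IOException {
        // Shield the parent so closing the nested stream does not close it as well.
        InputStream shield = new FilterInputStream(in) {
            @Override
            public void close() {
                // intentionally empty: the caller owns the underlying stream
            }
        };
        try (ZipInputStream zin = new ZipInputStream(shield)) {
            ZipEntry entry;
            while ((entry = zin.getNextEntry()) != null) {
                if (entry.isDirectory()) {
                    continue;
                }
                String path = prefix + "!/" + entry.getName();
                if (looksLikeZip(entry.getName())) {
                    walk(zin, path, found); // zin now yields the nested archive's bytes
                } else {
                    found.add(path);
                }
            }
        }
    }

    // Lists all non-archive entries, including those inside nested archives.
    static List<String> list(byte[] zip, String name) {
        List<String> found = new ArrayList<>();
        try (InputStream in = new ByteArrayInputStream(zip)) {
            walk(in, name, found);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return found;
    }

    // Test helper: builds a zip with a single entry in memory.
    static byte[] zipOf(String entryName, byte[] content) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zout = new ZipOutputStream(bytes)) {
            zout.putNextEntry(new ZipEntry(entryName));
            zout.write(content);
            zout.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }
}
```

A compressed tarball does not fit this pattern as easily, which is the pain mentioned above.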

juliusmusseau avatar Dec 14 '21 17:12 juliusmusseau

Dear Julius, I have tried it again with the 2021-12-16 version and it does indeed skip tarballs. I now only get an out-of-memory error after some time, probably because multipart ZIP files and self-extracting shell ZIP files cannot be detected / analyzed successfully. But the static object modules and jexec are not reported any more. Thanks for that, it works for me. Dunno about @volker-graf, being the original poster. Kind regards, Stefan

stefan123t avatar Dec 20 '21 11:12 stefan123t

I got a few "Out Of Memory"-errors but I fixed them by adding -Xmx8G to the cmd-line-arguments.
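For anyone who wants to verify that the setting is actually picked up: -Xmx sets the JVM's maximum heap size, and Runtime.maxMemory() reports the limit the JVM is running with. A tiny sketch (the class name HeapCheck is made up) to run with the same -Xmx value:

```java
// Hypothetical helper to print the maximum heap the JVM was started with.
final class HeapCheck {

    // Maximum heap in MiB as reported by the running JVM.
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + maxHeapMiB() + " MiB");
    }
}
```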

volker-graf avatar Dec 20 '21 21:12 volker-graf