Downloader: Skip zero size artifacts

Open timo-HERE opened this issue 1 year ago • 6 comments

Sometimes ORT comes across zero-size source artifacts. Those are obviously errors that should be fixed in, or deleted from, the repository, but I think ORT should also gracefully ignore those files instead of trying to unpack 0-byte files.

ERROR org.ossreviewtoolkit.downloader.Downloader - Could not unpack source artifact '/tmp/ort-Downloader2786093702508072637/xmlpull-1.1.3.1-sources.jar': IOException: Unable to unpack '/tmp/ort-Downloader2786093702508072637/xmlpull-1.1.3.1-sources.jar'. This file is not a supported archive type.
Suppressed: IOException: Unpacking '/tmp/ort-Downloader2786093702508072637/xmlpull-1.1.3.1-sources.jar' as ZIP failed.
    Caused by: IOException: Error on ZipFile /tmp/ort-Downloader2786093702508072637/xmlpull-1.1.3.1-sources.jar
        Caused by: ZipException: Archive is not a ZIP archiveSuppressed: IOException: Unpacking '/tmp/ort-Downloader2786093702508072637/xmlpull-1.1.3.1-sources.jar' as SEVENZIP failed.
    Caused by: EOFException: nullSuppressed: IOException: Unpacking '/tmp/ort-Downloader2786093702508072637/xmlpull-1.1.3.1-sources.jar' as TAR_BZIP2 failed.
    Caused by: IOException: Stream is not in the BZip2 formatSuppressed: IOException: Unpacking '/tmp/ort-Downloader2786093702508072637/xmlpull-1.1.3.1-sources.jar' as TAR_GZIP failed.
    Caused by: IOException: Input is not in the .gz formatSuppressed: IOException: Unpacking '/tmp/ort-Downloader2786093702508072637/xmlpull-1.1.3.1-sources.jar' as TAR_XZ failed.
    Caused by: EOFException: nullSuppressed: IOException: Unpacking '/tmp/ort-Downloader2786093702508072637/xmlpull-1.1.3.1-sources.jar' as TAR failed.
    Caused by: IOException: Unsupported archive type or empty archive.Suppressed: IOException: Unpacking '/tmp/ort-Downloader2786093702508072637/xmlpull-1.1.3.1-sources.jar' as DEB failed.
    Caused by: IOException: Failed to read header. Occurred at byte: 0
09:50:35.435 [main] ERROR org.ossreviewtoolkit.downloader.Downloader - Could not unpack source artifact '/tmp/ort-Downloader7811228183224628765/xmlpull-1.1.3.1-sources.jar': IOException: Unable to unpack '/tmp/ort-Downloader7811228183224628765/xmlpull-1.1.3.1-sources.jar'. This file is not a supported archive type.
Suppressed: IOException: Unpacking '/tmp/ort-Downloader7811228183224628765/xmlpull-1.1.3.1-sources.jar' as ZIP failed.
    Caused by: IOException: Error on ZipFile /tmp/ort-Downloader7811228183224628765/xmlpull-1.1.3.1-sources.jar
        Caused by: ZipException: Archive is not a ZIP archiveSuppressed: IOException: Unpacking '/tmp/ort-Downloader7811228183224628765/xmlpull-1.1.3.1-sources.jar' as SEVENZIP failed.
    Caused by: EOFException: nullSuppressed: IOException: Unpacking '/tmp/ort-Downloader7811228183224628765/xmlpull-1.1.3.1-sources.jar' as TAR_BZIP2 failed.
    Caused by: IOException: Stream is not in the BZip2 formatSuppressed: IOException: Unpacking '/tmp/ort-Downloader7811228183224628765/xmlpull-1.1.3.1-sources.jar' as TAR_GZIP failed.
    Caused by: IOException: Input is not in the .gz formatSuppressed: IOException: Unpacking '/tmp/ort-Downloader7811228183224628765/xmlpull-1.1.3.1-sources.jar' as TAR_XZ failed.
    Caused by: EOFException: nullSuppressed: IOException: Unpacking '/tmp/ort-Downloader7811228183224628765/xmlpull-1.1.3.1-sources.jar' as TAR failed.
    Caused by: IOException: Unsupported archive type or empty archive.Suppressed: IOException: Unpacking '/tmp/ort-Downloader7811228183224628765/xmlpull-1.1.3.1-sources.jar' as DEB failed.
    Caused by: IOException: Failed to read header. Occurred at byte: 0
Exception in thread "main" org.ossreviewtoolkit.downloader.DownloadException: java.io.IOException: Unable to unpack '/tmp/ort-Downloader7811228183224628765/xmlpull-1.1.3.1-sources.jar'. This file is not a supported archive type.
	at org.ossreviewtoolkit.downloader.Downloader.downloadSourceArtifact(Downloader.kt:377)
	at org.ossreviewtoolkit.downloader.Downloader.downloadSourceArtifact$default(Downloader.kt:303)
	at org.ossreviewtoolkit.scanner.provenance.DefaultProvenanceDownloader.download(ProvenanceDownloader.kt:68)
	at org.ossreviewtoolkit.scanner.Scanner.downloadRecursively(Scanner.kt:745)
	at org.ossreviewtoolkit.scanner.Scanner.createMissingArchives(Scanner.kt:721)
	at org.ossreviewtoolkit.scanner.Scanner.scan(Scanner.kt:181)
	at org.ossreviewtoolkit.scanner.Scanner$scan$3.invokeSuspend(Scanner.kt)
	at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
	at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:108)
	at kotlinx.coroutines.EventLoopImplBase.processNextEvent(EventLoop.common.kt:280)
	at kotlinx.coroutines.BlockingCoroutine.joinBlocking(Builders.kt:85)
	at kotlinx.coroutines.BuildersKt__BuildersKt.runBlocking(Builders.kt:59)
	at kotlinx.coroutines.BuildersKt.runBlocking(Unknown Source)
	at kotlinx.coroutines.BuildersKt__BuildersKt.runBlocking$default(Builders.kt:38)
	at kotlinx.coroutines.BuildersKt.runBlocking$default(Unknown Source)
	at org.ossreviewtoolkit.plugins.commands.scanner.ScannerCommand.runScanners(ScannerCommand.kt:227)
	at org.ossreviewtoolkit.plugins.commands.scanner.ScannerCommand.run(ScannerCommand.kt:140)
	at com.github.ajalt.clikt.parsers.Parser.parse(Parser.kt:306)
	at com.github.ajalt.clikt.parsers.Parser.parse(Parser.kt:319)
	at com.github.ajalt.clikt.parsers.Parser.parse(Parser.kt:40)
	at com.github.ajalt.clikt.core.CliktCommand.parse(CliktCommand.kt:458)
	at com.github.ajalt.clikt.core.CliktCommand.parse$default(CliktCommand.kt:455)
	at com.github.ajalt.clikt.core.CliktCommand.main(CliktCommand.kt:475)
	at com.github.ajalt.clikt.core.CliktCommand.main(CliktCommand.kt:482)
	at org.ossreviewtoolkit.cli.OrtMainKt.main(OrtMain.kt:85)
Caused by: java.io.IOException: Unable to unpack '/tmp/ort-Downloader7811228183224628765/xmlpull-1.1.3.1-sources.jar'. This file is not a supported archive type.
	at org.ossreviewtoolkit.utils.common.ArchiveUtilsKt.unpackTryAllTypes(ArchiveUtils.kt:119)
	at org.ossreviewtoolkit.utils.common.ArchiveUtilsKt.unpackTryAllTypes$default(ArchiveUtils.kt:105)
	at org.ossreviewtoolkit.downloader.Downloader.downloadSourceArtifact(Downloader.kt:369)
	... 24 more
	Suppressed: java.io.IOException: Unpacking '/tmp/ort-Downloader7811228183224628765/xmlpull-1.1.3.1-sources.jar' as ZIP failed.
		at org.ossreviewtoolkit.utils.common.ArchiveUtilsKt.unpackTryAllTypes(ArchiveUtils.kt:117)
		... 26 more
	Caused by: java.io.IOException: Error on ZipFile /tmp/ort-Downloader7811228183224628765/xmlpull-1.1.3.1-sources.jar
		at org.apache.commons.compress.archivers.zip.ZipFile.<init>(ZipFile.java:655)
		at org.apache.commons.compress.archivers.zip.ZipFile.<init>(ZipFile.java:552)
		at org.apache.commons.compress.archivers.zip.ZipFile.<init>(ZipFile.java:531)
		at org.apache.commons.compress.archivers.zip.ZipFile.<init>(ZipFile.java:458)
		at org.apache.commons.compress.archivers.zip.ZipFile.<init>(ZipFile.java:446)
		at org.ossreviewtoolkit.utils.common.ArchiveUtilsKt.unpackZip(ArchiveUtils.kt:171)
		at org.ossreviewtoolkit.utils.common.ArchiveUtilsKt.unpack(ArchiveUtils.kt:86)
		at org.ossreviewtoolkit.utils.common.ArchiveUtilsKt.unpackTryAllTypes(ArchiveUtils.kt:113)
		... 26 more
	Caused by: java.util.zip.ZipException: Archive is not a ZIP archive
		at org.apache.commons.compress.archivers.zip.ZipFile.positionAtEndOfCentralDirectoryRecord(ZipFile.java:1146)
		at org.apache.commons.compress.archivers.zip.ZipFile.positionAtCentralDirectory(ZipFile.java:1034)
		at org.apache.commons.compress.archivers.zip.ZipFile.populateFromCentralDirectory(ZipFile.java:1009)
		at org.apache.commons.compress.archivers.zip.ZipFile.<init>(ZipFile.java:648)
		... 33 more
	Suppressed: java.io.IOException: Unpacking '/tmp/ort-Downloader7811228183224628765/xmlpull-1.1.3.1-sources.jar' as SEVENZIP failed.
		at org.ossreviewtoolkit.utils.common.ArchiveUtilsKt.unpackTryAllTypes(ArchiveUtils.kt:117)
		... 26 more
	Caused by: java.io.EOFException
		at org.apache.commons.compress.utils.IOUtils.readFully(IOUtils.java:278)
		at org.apache.commons.compress.archivers.sevenz.SevenZFile.readFully(SevenZFile.java:1299)
		at org.apache.commons.compress.archivers.sevenz.SevenZFile.readHeaders(SevenZFile.java:1335)
		at org.apache.commons.compress.archivers.sevenz.SevenZFile.<init>(SevenZFile.java:481)
		at org.apache.commons.compress.archivers.sevenz.SevenZFile.<init>(SevenZFile.java:332)
		at org.apache.commons.compress.archivers.sevenz.SevenZFile.<init>(SevenZFile.java:345)
		at org.apache.commons.compress.archivers.sevenz.SevenZFile.<init>(SevenZFile.java:291)
		at org.ossreviewtoolkit.utils.common.ArchiveUtilsKt.unpack7Zip(ArchiveUtils.kt:129)
		at org.ossreviewtoolkit.utils.common.ArchiveUtilsKt.unpack(ArchiveUtils.kt:85)
		at org.ossreviewtoolkit.utils.common.ArchiveUtilsKt.unpackTryAllTypes(ArchiveUtils.kt:113)
		... 26 more
	Suppressed: java.io.IOException: Unpacking '/tmp/ort-Downloader7811228183224628765/xmlpull-1.1.3.1-sources.jar' as TAR_BZIP2 failed.
		at org.ossreviewtoolkit.utils.common.ArchiveUtilsKt.unpackTryAllTypes(ArchiveUtils.kt:117)
		... 26 more
	Caused by: java.io.IOException: Stream is not in the BZip2 format
		at org.apache.commons.compress.compressors.bzip2.BZip2CompressorInputStream.init(BZip2CompressorInputStream.java:567)
		at org.apache.commons.compress.compressors.bzip2.BZip2CompressorInputStream.<init>(BZip2CompressorInputStream.java:293)
		at org.apache.commons.compress.compressors.bzip2.BZip2CompressorInputStream.<init>(BZip2CompressorInputStream.java:271)
		at org.ossreviewtoolkit.utils.common.ArchiveUtilsKt.unpack(ArchiveUtils.kt:89)
		at org.ossreviewtoolkit.utils.common.ArchiveUtilsKt.unpackTryAllTypes(ArchiveUtils.kt:113)
		... 26 more
	Suppressed: java.io.IOException: Unpacking '/tmp/ort-Downloader7811228183224628765/xmlpull-1.1.3.1-sources.jar' as TAR_GZIP failed.
		at org.ossreviewtoolkit.utils.common.ArchiveUtilsKt.unpackTryAllTypes(ArchiveUtils.kt:117)
		... 26 more
	Caused by: java.io.IOException: Input is not in the .gz format
		at org.apache.commons.compress.compressors.gzip.GzipCompressorInputStream.init(GzipCompressorInputStream.java:242)
		at org.apache.commons.compress.compressors.gzip.GzipCompressorInputStream.<init>(GzipCompressorInputStream.java:189)
		at org.apache.commons.compress.compressors.gzip.GzipCompressorInputStream.<init>(GzipCompressorInputStream.java:153)
		at org.ossreviewtoolkit.utils.common.ArchiveUtilsKt.unpack(ArchiveUtils.kt:90)
		at org.ossreviewtoolkit.utils.common.ArchiveUtilsKt.unpackTryAllTypes(ArchiveUtils.kt:113)
		... 26 more
	Suppressed: java.io.IOException: Unpacking '/tmp/ort-Downloader7811228183224628765/xmlpull-1.1.3.1-sources.jar' as TAR_XZ failed.
		at org.ossreviewtoolkit.utils.common.ArchiveUtilsKt.unpackTryAllTypes(ArchiveUtils.kt:117)
		... 26 more
	Caused by: java.io.EOFException
		at java.base/java.io.DataInputStream.readFully(DataInputStream.java:203)
		at java.base/java.io.DataInputStream.readFully(DataInputStream.java:172)
		at org.tukaani.xz.SingleXZInputStream.readStreamHeader(Unknown Source)
		at org.tukaani.xz.SingleXZInputStream.<init>(Unknown Source)
		at org.tukaani.xz.SingleXZInputStream.<init>(Unknown Source)
		at org.tukaani.xz.SingleXZInputStream.<init>(Unknown Source)
		at org.apache.commons.compress.compressors.xz.XZCompressorInputStream.<init>(XZCompressorInputStream.java:134)
		at org.apache.commons.compress.compressors.xz.XZCompressorInputStream.<init>(XZCompressorInputStream.java:102)
		at org.apache.commons.compress.compressors.xz.XZCompressorInputStream.<init>(XZCompressorInputStream.java:79)
		at org.ossreviewtoolkit.utils.common.ArchiveUtilsKt.unpack(ArchiveUtils.kt:91)
		at org.ossreviewtoolkit.utils.common.ArchiveUtilsKt.unpackTryAllTypes(ArchiveUtils.kt:113)
		... 26 more
	Suppressed: java.io.IOException: Unpacking '/tmp/ort-Downloader7811228183224628765/xmlpull-1.1.3.1-sources.jar' as TAR failed.
		at org.ossreviewtoolkit.utils.common.ArchiveUtilsKt.unpackTryAllTypes(ArchiveUtils.kt:117)
		... 26 more
	Caused by: java.io.IOException: Unsupported archive type or empty archive.
		at org.ossreviewtoolkit.utils.common.ArchiveUtilsKt.unpack(ArchiveUtils.kt:305)
		at org.ossreviewtoolkit.utils.common.ArchiveUtilsKt.unpackTar(ArchiveUtils.kt:259)
		at org.ossreviewtoolkit.utils.common.ArchiveUtilsKt.unpack(ArchiveUtils.kt:88)
		at org.ossreviewtoolkit.utils.common.ArchiveUtilsKt.unpackTryAllTypes(ArchiveUtils.kt:113)
		... 26 more
	Suppressed: java.io.IOException: Unpacking '/tmp/ort-Downloader7811228183224628765/xmlpull-1.1.3.1-sources.jar' as DEB failed.
		at org.ossreviewtoolkit.utils.common.ArchiveUtilsKt.unpackTryAllTypes(ArchiveUtils.kt:117)
		... 26 more
	Caused by: java.io.IOException: Failed to read header. Occurred at byte: 0
		at org.apache.commons.compress.archivers.ar.ArArchiveInputStream.getNextArEntry(ArArchiveInputStream.java:264)
		at org.apache.commons.compress.archivers.ar.ArArchiveInputStream.getNextEntry(ArArchiveInputStream.java:350)
		at org.apache.commons.compress.archivers.ar.ArArchiveInputStream.getNextEntry(ArArchiveInputStream.java:36)
		at org.ossreviewtoolkit.utils.common.ArchiveUtilsKt.unpack(ArchiveUtils.kt:278)
		at org.ossreviewtoolkit.utils.common.ArchiveUtilsKt.unpackDeb(ArchiveUtils.kt:226)
		at org.ossreviewtoolkit.utils.common.ArchiveUtilsKt.unpack(ArchiveUtils.kt:93)
		at org.ossreviewtoolkit.utils.common.ArchiveUtilsKt.unpackTryAllTypes(ArchiveUtils.kt:113)
		... 26 more
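
For illustration, the kind of pre-check requested above could look roughly like the following. This is only a minimal sketch in plain Java: `ZeroSizeArtifactCheck` and `isEmptyArtifact` are hypothetical names that do not exist in ORT, and where such a check would belong in the Downloader is an open question.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of ORT: decides whether a downloaded
// source artifact is worth handing to an unpacker at all.
final class ZeroSizeArtifactCheck {
    private ZeroSizeArtifactCheck() {}

    // Returns true if the path is a regular file that contains no bytes.
    static boolean isEmptyArtifact(Path artifact) throws IOException {
        return Files.isRegularFile(artifact) && Files.size(artifact) == 0L;
    }

    public static void main(String[] args) throws IOException {
        // A freshly created temp file is empty, like the artifact in the log above.
        Path artifact = Files.createTempFile("xmlpull-1.1.3.1-sources", ".jar");
        try {
            if (isEmptyArtifact(artifact)) {
                System.out.println("Skipping zero-size source artifact: " + artifact.getFileName());
            } else {
                System.out.println("Would unpack: " + artifact.getFileName());
            }
        } finally {
            Files.deleteIfExists(artifact);
        }
    }
}
```

With a check like this, the unpacker would not need to try every archive type on a file that cannot contain any data.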

timo-HERE avatar Jan 15 '24 20:01 timo-HERE

> Those are obviously errors that should be fixed in, or deleted from, the repository, but I think ORT should also gracefully ignore those files instead of trying to unpack 0-byte files.

How would a user become aware of these errors in order to fix them if ORT were to "gracefully ignore those files"? Or do you mean that an error should be logged, but no exception should be thrown?

But actually, I've not seen this before myself, so I'm not sure how worthwhile special handling for 0-byte files would be.

sschuberth avatar Jan 15 '24 20:01 sschuberth

This can be shown as an issue, but the scan should continue. The current behaviour is that this exception stops the execution.
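
To make the requested behaviour concrete, here is a rough sketch in plain Java. It is not ORT's actual Scanner or Downloader code, and `ContinueOnDownloadFailure`, `Unpacker` and `processAll` are made-up names: each failure is recorded as an issue message and the loop moves on to the next artifact.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not ORT's actual Scanner or Downloader code.
final class ContinueOnDownloadFailure {
    private ContinueOnDownloadFailure() {}

    // Stand-in for a download-and-unpack step that may fail per artifact.
    interface Unpacker {
        void unpack(String artifact) throws IOException;
    }

    // Tries every artifact; a failure becomes an issue message instead of aborting the loop.
    static List<String> processAll(List<String> artifacts, Unpacker unpacker) {
        List<String> issues = new ArrayList<>();
        for (String artifact : artifacts) {
            try {
                unpacker.unpack(artifact);
            } catch (IOException e) {
                issues.add("Could not unpack '" + artifact + "': " + e.getMessage());
            }
        }
        return issues;
    }

    public static void main(String[] args) {
        List<String> artifacts = List.of("a-sources.jar", "empty-sources.jar", "b-sources.jar");
        // The lambda simulates an unpacker that rejects one broken artifact.
        List<String> issues = processAll(artifacts, artifact -> {
            if (artifact.startsWith("empty")) {
                throw new IOException("This file is not a supported archive type.");
            }
        });
        issues.forEach(System.out::println);
    }
}
```

The point of the sketch is only the control flow: the broken artifact ends up in the list of issues, while the artifacts after it are still processed.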

timo-HERE avatar Jan 15 '24 21:01 timo-HERE

Most likely the root cause was an issue with the OSGeo repository (repo.osgeo.org) which happened last year.

Discussion: https://www.mail-archive.com/[email protected]/msg20339.html
Issue ticket: https://trac.osgeo.org/osgeo/ticket/2978

timo-HERE avatar Feb 07 '24 21:02 timo-HERE

I've also just come across this. What surprises me is that the logs look as if the exception is not caught. But looking at the code, it should be caught. So maybe the exception in the log is not the root cause of the crash?

See: https://github.com/oss-review-toolkit/ort/blob/bf5661b1853e4febb182cc615215387f27b3983f/scanner/src/main/kotlin/Scanner.kt#L705-L719

fviernau avatar Feb 19 '24 08:02 fviernau

I guess https://github.com/oss-review-toolkit/ort/pull/8303 helps?

sschuberth avatar Feb 19 '24 08:02 sschuberth

@timo-HERE could you please check whether the issue is resolved for you with the latest main? I.e. 0-byte artifacts are still not skipped explicitly, but coming across them should now create an issue entry instead of throwing an exception.

sschuberth avatar Feb 19 '24 09:02 sschuberth

Feel free to reopen if the issue persists.

sschuberth avatar May 02 '24 15:05 sschuberth