ort icon indicating copy to clipboard operation
ort copied to clipboard

Scanner freezes when using ClearlyDefined as a scan storage provider

Open martencassel opened this issue 4 years ago • 26 comments

When i run the scanner using clearlydefined with a file based storage, the scanner seem to hang.

I don't know if its a problem using the clearlydefined api or if my configuration file is incorrect ?

Is there any flag to see more information from the scanner, like debug messages ?

The process doesn't seem to do much

sudo sudo strace -f -e trace=network -s 10000 -p `pidof java`
strace: Process 110620 attached with 91 threads
[pid 110770] recvfrom(295, 
 <unfinished ...>
[pid 110767] sendto(295, "\27\3\3\0#\315K\17\366~\3\365\5\37\r\327.N.\274\215y>\264\10\355u\244\277\35K^{9\69\233\305i\"", 40, 0, NULL, 0) = 40
[pid 110767] sendto(295, "\27\3\3\0#\326?|j\201\320\224\257I\316\261\207\333(\\\311>\10\2217x1\255\227\223J\271\377,R\224!>\236\301", 40, 0, NULL, 0) = 40
[pid 110767] shutdown(295, SHUT_WR)     = 0
[pid 110767] shutdown(295, SHUT_RD)     = 0
[pid 110770] <... recvfrom resumed>"", 5, 0, NULL, NULL) = 0
[pid 110770] --- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_ACCERR, si_addr=0x7f231368e008} ---
[pid 110767] +++ exited with 0 +++
[pid 110770] +++ exited with 0 +++
[pid 110620] --- SIGWINCH {si_signo=SIGWINCH, si_code=SI_USER, si_pid=0, si_uid=0} ---
[pid 110620] --- SIGWINCH {si_signo=SIGWINCH, si_code=SI_USER, si_pid=0, si_uid=0} ---
[pid 110620] --- SIGWINCH {si_signo=SIGWINCH, si_code=SI_USER, si_pid=0, si_uid=0} ---
[pid 110620] --- SIGWINCH {si_signo=SIGWINCH, si_code=SI_USER, si_pid=0, si_uid=0} ---
[pid 110620] --- SIGWINCH {si_signo=SIGWINCH, si_code=SI_USER, si_pid=0, si_uid=0} ---
[pid 110620] --- SIGWINCH {si_signo=SIGWINCH, si_code=SI_USER, si_pid=0, si_uid=0} ---
[pid 110620] --- SIGWINCH {si_signo=SIGWINCH, si_code=SI_USER, si_pid=0, si_uid=0} ---
[pid 110620] --- SIGWINCH {si_signo=SIGWINCH, si_code=SI_USER, si_pid=0, si_uid=0} ---
~/scripts/scan.sh 
20:06:35.331 [main] INFO  org.ossreviewtoolkit.model.config.OrtConfiguration - Using ORT configuration file '/ort-home/.ort/ort.conf'.
20:06:35.638 [main] INFO  org.ossreviewtoolkit.model.config.LicenseFilenamePatterns - Configuring the license file patterns.
________ _____________________
\_____  \\______   \__    ___/ the OSS Review Toolkit, version DOCKER-SNAPSHOT.
 /   |   \|       _/ |    |
/    |    \    |   \ |    |    Running 'scan' under Java 11.0.8 on Linux with
\_______  /____|_  / |____|    8 CPUs and a maximum of 7942 MiB of memory.
        \/       \/
Environment variables:
ORT_CONFIG_DIR = /ort-home/.ort
ORT_DATA_DIR = /ort-home/.ort
JAVA_HOME = /opt/java/openjdk
ANDROID_HOME = /opt/android-sdk
GOPATH = /go

20:06:35.682 [main] INFO  org.ossreviewtoolkit.scanner.ScanResultsStorage - Using file based storage with local directory '/ort-home/.ort/scan-results'.
20:06:35.689 [main] INFO  org.ossreviewtoolkit.scanner.ScanResultsStorage - Using ClearlyDefined storage with URL 'https://api.clearlydefined.io'.
20:06:35.694 [main] INFO  org.ossreviewtoolkit.scanner.ScanResultsStorage - Using composite storage with readers FileBasedStorage with XZCompressedLocalFileStorage backend, ClearlyDefinedStorage and writers FileBasedStorage with XZCompressedLocalFileStorage backend.
20:06:35.695 [main] INFO  org.ossreviewtoolkit.scanner.ScanResultsStorage - ScanResultStorage has been configured to composite[readers:[FileBasedStorage with XZCompressedLocalFileStorage backend, ClearlyDefinedStorage], writers:[FileBasedStorage with XZCompressedLocalFileStorage backend]].
Using scan storage 'composite[readers:[FileBasedStorage with XZCompressedLocalFileStorage backend, ClearlyDefinedStorage], writers:[FileBasedStorage with XZCompressedLocalFileStorage backend]]'.
Using scanner 'ScanCode'.
20:06:36.733 [main] INFO  org.ossreviewtoolkit.scanner.scanners.scancode.ScanCode - Searching scan results for 1 packages.
20:06:36.739 [main] INFO  org.ossreviewtoolkit.scanner.scanners.scancode.ScanCode - Searching scan results for 380 packages.
20:06:37.199 [DefaultDispatcher-worker-34] INFO  org.ossreviewtoolkit.scanner.storages.ClearlyDefinedStorage - Looking up results for 'GoMod::github.com/cavaliercoder/badio:v0.0.0-20160213150051-ce5280129e9e'.
20:06:37.199 [DefaultDispatcher-worker-9] INFO  org.ossreviewtoolkit.scanner.storages.ClearlyDefinedStorage - Looking up results for 'GoMod::github.com/anmitsu/go-shlex:v0.0.0-20161002113705-648efa622239'.
20:06:37.199 [DefaultDispatcher-worker-63] INFO  org.ossreviewtoolkit.scanner.storages.ClearlyDefinedStorage - Looking up results for 'GoMod::cloud.google.com/go:v0.62.0'.
20:06:37.200 [DefaultDispatcher-worker-7] INFO  org.ossreviewtoolkit.scanner.storages.ClearlyDefinedStorage - Looking up results for 'GoMod::github.com/asaskevich/govalidator:v0.0.0-20200907205600-7a23bdc65eef'.
20:06:37.200 [DefaultDispatcher-worker-37] INFO  org.ossreviewtoolkit.scanner.storages.ClearlyDefinedStorage - Looking up results for 'GoMod::github.com/Masterminds/squirrel:v1.4.0'.
20:06:37.200 [DefaultDispatcher-worker-61] INFO  org.ossreviewtoolkit.scanner.storages.ClearlyDefinedStorage - Looking up results for 'GoMod::cloud.google.com/go/storage:v1.6.0'.
20:06:37.200 [DefaultDispatcher-worker-13] INFO  org.ossreviewtoolkit.scanner.storages.ClearlyDefinedStorage - Looking up results for 'GoMod::cloud.google.com/go:v0.65.0'.
20:06:37.200 [DefaultDispatcher-worker-41] INFO  org.ossreviewtoolkit.scanner.storages.ClearlyDefinedStorage - Looking up results for 'GoMod::github.com/Masterminds/goutils:v1.1.0'.
20:06:37.200 [DefaultDispatcher-worker-45] INFO  org.ossreviewtoolkit.scanner.storages.ClearlyDefinedStorage - Looking up results for 'GoMod::github.com/cavaliercoder/go-rpm:v0.0.0-20200122174316-8cb9fd9c31a8'.
20:06:37.200 [DefaultDispatcher-worker-38] INFO  org.ossreviewtoolkit.scanner.storages.ClearlyDefinedStorage - Looking up results for 'GoMod::github.com/alecthomas/jsonschema:v0.0.0-20200530073317-71f438968921'.
20:06:37.200 [DefaultDispatcher-worker-10] INFO  org.ossreviewtoolkit.scanner.storages.ClearlyDefinedStorage - Looking up results for 'GoMod::github.com/BurntSushi/toml:v0.3.1'.
20:06:37.200 [DefaultDispatcher-worker-59] INFO  org.ossreviewtoolkit.scanner.storages.ClearlyDefinedStorage - Looking up results for 'GoMod::cloud.google.com/go/storage:v1.5.0'.
20:06:37.200 [DefaultDispatcher-worker-46] INFO  org.ossreviewtoolkit.scanner.storages.ClearlyDefinedStorage - Looking up results for 'GoMod::code.gitea.io/sdk/gitea:v0.13.0'.
20:06:37.200 [DefaultDispatcher-worker-60] INFO  org.ossreviewtoolkit.scanner.storages.ClearlyDefinedStorage - Looking up results for 'GoMod::cloud.google.com/go:v0.76.0'.
20:06:37.200 [DefaultDispatcher-worker-43] INFO  org.ossreviewtoolkit.scanner.storages.ClearlyDefinedStorage - Looking up results for 'GoMod::github.com/Masterminds/sprig:v2.22.0+incompatible'.
20:06:37.201 [DefaultDispatcher-worker-48] INFO  org.ossreviewtoolkit.scanner.storages.ClearlyDefinedStorage - Looking up results for 'GoMod::cloud.google.com/go:v0.34.0'.
20:06:37.200 [DefaultDispatcher-worker-18] INFO  org.ossreviewtoolkit.scanner.storages.ClearlyDefinedStorage - Looking up results for 'GoMod::github.com/asaskevich/govalidator:v0.0.0-20200428143746-21a406dcc535'.
20:06:37.201 [DefaultDispatcher-worker-24] INFO  org.ossreviewtoolkit.scanner.storages.ClearlyDefinedStorage - Looking up results for 'GoMod::cloud.google.com/go/storage:v1.0.0'.
20:06:37.201 [DefaultDispatcher-worker-50] INFO  org.ossreviewtoolkit.scanner.storages.ClearlyDefinedStorage - Looking up results for 'GoMod::cloud.google.com/go:v0.72.0'.
20:06:37.201 [DefaultDispatcher-worker-30] INFO  org.ossreviewtoolkit.scanner.storages.ClearlyDefinedStorage - Looking up results for 'GoMod::cloud.google.com/go/storage:v1.10.0'.
20:06:37.201 [DefaultDispatcher-worker-33] INFO  org.ossreviewtoolkit.scanner.storages.ClearlyDefinedStorage - Looking up results for 'GoMod::cloud.google.com/go:v0.26.0'.
20:06:37.198 [DefaultDispatcher-worker-35] INFO  org.ossreviewtoolkit.scanner.storages.ClearlyDefinedStorage - Looking up results for 'GoMod::github.com/cavaliercoder/go-cpio:v0.0.0-20180626203310-925f9528c45e'.
20:06:37.198 [DefaultDispatcher-worker-36] INFO  org.ossreviewtoolkit.scanner.storages.ClearlyDefinedStorage - Looking up results for 'GoMod::github.com/Masterminds/semver:v1.5.0'.
20:06:37.198 [DefaultDispatcher-worker-64] INFO  org.ossreviewtoolkit.scanner.storages.ClearlyDefinedStorage - Looking up results for 'GoMod::cloud.google.com/go/storage:v1.8.0'.
20:06:37.198 [DefaultDispatcher-worker-62] INFO  org.ossreviewtoolkit.scanner.storages.ClearlyDefinedStorage - Looking up results for 'GoMod::cloud.google.com/go:v0.74.0'.
20:06:37.210 [DefaultDispatcher-worker-8] INFO  org.ossreviewtoolkit.scanner.storages.ClearlyDefinedStorage - Looking up results for 'GoMod::cloud.google.com/go:v0.57.0'.
20:06:37.225 [DefaultDispatcher-worker-34] INFO  org.ossreviewtoolkit.utils.OrtAuthenticator - Authenticator was successfully installed.
20:06:37.240 [DefaultDispatcher-worker-34] INFO  org.ossreviewtoolkit.utils.OrtProxySelector - Proxy selector was successfully installed.
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by retrofit2.Platform (file:/opt/ort/lib/retrofit-2.9.0.jar) to constructor java.lang.invoke.MethodHandles$Lookup(java.lang.Class,int)
WARNING: Please consider reporting this to the maintainers of retrofit2.Platform
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
20:06:37.446 [DefaultDispatcher-worker-30] INFO  org.ossreviewtoolkit.scanner.storages.ClearlyDefinedStorage - Looking up results for 'GoMod::cloud.google.com/go:v0.54.0'.
20:06:37.446 [DefaultDispatcher-worker-34] INFO  org.ossreviewtoolkit.scanner.storages.ClearlyDefinedStorage - Looking up results for 'GoMod::cloud.google.com/go:v0.44.3'.
20:06:37.446 [DefaultDispatcher-worker-35] INFO  org.ossreviewtoolkit.scanner.storages.ClearlyDefinedStorage - Looking up results for 'GoMod::cloud.google.com/go:v0.46.3'.
20:06:37.447 [DefaultDispatcher-worker-30] INFO  org.ossreviewtoolkit.scanner.storages.ClearlyDefinedStorage - Looking up results for 'GoMod::cloud.google.com/go:v0.44.2'.
20:06:37.447 [DefaultDispatcher-worker-34] INFO  org.ossreviewtoolkit.scanner.storages.ClearlyDefinedStorage - Looking up results for 'GoMod::cloud.google.com/go:v0.44.1'.
20:06:37.447 [DefaultDispatcher-worker-35] INFO  org.ossreviewtoolkit.scanner.storages.ClearlyDefinedStorage - Looking up results for 'GoMod::cloud.google.com/go:v0.38.0'.
20:06:37.446 [DefaultDispatcher-worker-61] INFO  org.ossreviewtoolkit.scanner.storages.ClearlyDefinedStorage - Looking up results for 'GoMod::cloud.google.com/go:v0.50.0'.
20:06:37.447 [DefaultDispatcher-worker-34] INFO  org.ossreviewtoolkit.scanner.storages.ClearlyDefinedStorage - Looking up results for 'GoMod::github.com/cespare/xxhash/v2:v2.1.1'.
20:06:37.446 [DefaultDispatcher-worker-41] INFO  org.ossreviewtoolkit.scanner.storages.ClearlyDefinedStorage - Looking up results for 'GoMod::cloud.google.com/go:v0.52.0'.
20:06:37.447 [DefaultDispatcher-worker-30] INFO  org.ossreviewtoolkit.scanner.storages.ClearlyDefinedStorage - Looking up results for 'GoMod::cloud.google.com/go:v0.45.1'.
20:06:37.446 [DefaultDispatcher-worker-50] INFO  org.ossreviewtoolkit.scanner.storages.ClearlyDefinedStorage - Looking up results for 'GoMod::cloud.google.com/go:v0.56.0'.
20:06:37.446 [DefaultDispatcher-worker-60] INFO  org.ossreviewtoolkit.scanner.storages.ClearlyDefinedStorage - Looking up results for 'GoMod::cloud.google.com/go:v0.53.0'.
cat ~/scripts/scan.sh 
#!/bin/bash

export CURRENT_DIRNAME=$(basename "$PWD")

sudo rm -rf ~/.ort/project/packaged/ort/scanner/native-scan-results

docker run -e ORT_CONFIG_DIR=/ort-home/.ort \
           -e ORT_DATA_DIR=/ort-home/.ort \
           -v $HOME:/ort-home \
           -v $(pwd):/project \
	   -v ~/.gitconfig:/root/.gitconfig \
           ort:latest --info scan -i /ort-home/.ort/project/$CURRENT_DIRNAME/ort/analyzer/analyzer-result.json -o /ort-home/.ort/project/$CURRENT_DIRNAME/ort/scanner
 cat ~/.ort/ort.conf
ort {
  scanner {
    storages {
      clearlyDefined {
        serverUrl = "https://api.clearlydefined.io"
      }
      fileBasedStorage {
        backend {
          localFileStorage {
            directory = "/ort-home/.ort/scan-results"
            compression = true
          }
        }
      }
    }

    storageReaders: [
       "fileBasedStorage", "clearlyDefined"
    ]

    storageWriters: [
      "fileBasedStorage"
    ]
  }
}

martencassel avatar Oct 04 '21 20:10 martencassel

We're also facing timeout issues with ClearlyDefined recently. My hope is that working on https://github.com/oss-review-toolkit/ort/issues/3905 will address this.

sschuberth avatar Oct 05 '21 06:10 sschuberth

I have this timeout isue as well. It also stops scanning after this warnings:

WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by retrofit2.Platform (file:/opt/ort/lib/retrofit-2.9.0.jar) to constructor java.lang.invoke.MethodHandles$Lookup(java.lang.Class,int)
WARNING: Please consider reporting this to the maintainers of retrofit2.Platform
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release

Sometimes one or two packages are scanned but not more.

Does someone has an idea for workaround?

Thank you very much.

schvvarzekatze avatar Dec 13 '21 14:12 schvvarzekatze

The timeout is unrelated to the bogus "An illegal reflective access operation has occurred" warning. As already explained e.g. here, that warning can simply be ignored.

sschuberth avatar Dec 13 '21 17:12 sschuberth

If I configure to read and write the scan result only on postgres it seems to stop due to a timeout:

3:13:43.225 [main] INFO  org.ossreviewtoolkit.utils.common.ProcessCapture - Running '/usr/local/bin/scancode --copyright --license --info --strip-root --timeout 300 --processes 7 /tmp/ort-ScanCode<id>/Maven/com.azure/azure-core-http-netty/1.5.0 --json-pp /sample-project/build/oss-review-toolkit/gradle/scan/native-scan-results/Maven/com.azure/azure-core-http-netty/1.5.0/scan-results_ScanCode.json' in '/'...

Would it help to configure the timeout of ScanCode

schvvarzekatze avatar Dec 15 '21 13:12 schvvarzekatze

Hmm, I don't see ScanCode running into a timeout in the logs you posted (you should see a message like "Timeout ... while scanning file ..." then).

As Netty is a rather big package IIRC, the scan could simply take long. A ScanCode scan can easily take several hours depending on the code size. PS: The timeout is per file, not per total scan duration.

sschuberth avatar Dec 15 '21 13:12 sschuberth

It does not even continue after more than one hour from this package only on postgres. But also with clearlyDefined it stops after some minutes scanning - not at the same package when I want to reproduce it. It does not continue even after several hours. I have never seen the timeout message in the logs so a longer timeout probably won't be a solution.

ort {
  analyzer {
  }

  scanner {
    storages {

      postgres {
        url = ${POSTGRES_URL}
        schema = "public"
        username = "ort"
        password = ${POSTGRES_PASSWORD}
        sslmode = "disable"
      }

    }

    storageReaders: [
      "postgres"
    ]

    storageWriters: [
      "postgres"
    ]
  }
}
ort {
  analyzer {
  }

  scanner {
    storages {
      clearlyDefined {
       serverUrl = "https://api.clearlydefined.io"
     }

      postgres {
        url = ${POSTGRES_URL}
        schema = "public"
        username = "ort"
        password = ${POSTGRES_PASSWORD}
        sslmode = "disable"
      }

      fileBasedStorage {
              backend {
                localFileStorage {
                  directory = "/ort/analyzer"
                  compression = true
                }
              }
            }
    }

    storageReaders: [
      "fileBasedStorage", "postgres", "clearlyDefined"
    ]

    storageWriters: [
      "fileBasedStorage", "postgres"
    ]
  }
}

schvvarzekatze avatar Dec 15 '21 15:12 schvvarzekatze

The scan passed azure-core-http-nettythis time after 1.5 hours. So it really takes very long. ;) Finally the scan with postgres was successful and took about 5 h.

schvvarzekatze avatar Dec 15 '21 16:12 schvvarzekatze

I'm also facing this freeze issue when using clearlyDefined as storageReader.

...
storageReaders = [
  fileBasedStorage, clearlyDefined
]

storageWriters = [
  fileBasedStorage 
]
...
root@2af0304336d8:/# ps aux
USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root           1  0.0  0.0   7244  4104 pts/0    Ss   06:45   0:00 /bin/bash
root        1089  0.8  1.1 14265044 379328 pts/0 Sl+  06:53   0:11 /opt/java/openjdk/bin/java -classpath /opt/ort/lib/*:/opt/ort/plugin/* org.ossreviewtoolkit.cli.OrtMainKt --info scan -i /project/ort/analyzer/analyzer-result.json -o /pro
root        1242  0.0  0.0   7244  3892 pts/1    Ss   07:15   0:00 /bin/bash
root        1252  0.0  0.0   8900  3252 pts/1    R+   07:16   0:00 ps aux

ORT is in Sl+ (interruptible sleep).

Any news on this?

Any news on this?

I guess no one is currently working on this because AFAIK ClearlyDefined uses a very old version of ScanCode by now, and thus it's quite unattractive to use ClearlyDefined as a scan storage reader currently...

sschuberth avatar Feb 09 '22 07:02 sschuberth

I accidentally came across this issue that I fixed, which likely also fixes this issue. Can anyone confirm?

sschuberth avatar Feb 24 '22 08:02 sschuberth

I would like to test this. Any particular projects that i can use as test cases ?

Apart from the scanning not freezing or anything like that, how do i know that this works ?

martencassel avatar Feb 25 '22 17:02 martencassel

I just ran the scanner (main/73af92a4b86cb5e01c0b867e6eb64115ed692bf0) of ort with a https://github.com/tornadoweb/tornado/tree/master with no issues running the scanner.

martencassel avatar Feb 25 '22 18:02 martencassel

Just curios, do you perform any e2e tests of clearlydefined in a pipline or such ?

martencassel avatar Feb 25 '22 18:02 martencassel

Just curios, do you perform any e2e tests of clearlydefined in a pipline or such ?

No, currently not.

sschuberth avatar Feb 25 '22 19:02 sschuberth

I just ran the scanner (main/73af92a4b86cb5e01c0b867e6eb64115ed692bf0) of ort with a https://github.com/tornadoweb/tornado/tree/master with no issues running the scanner.

That's good news. We should probably wait for other affected users to confirm the fix before closing this issue.

sschuberth avatar Feb 25 '22 19:02 sschuberth

I'll give it a try within the next days. With this configuration:

storageReaders = [
  fileBasedStorage, clearlyDefined
]

storageWriters = [
  fileBasedStorage 
]

In our case there no timeout anymore with this configuration using the recent docker image of ORT:

ort {
  analyzer {
  }

  scanner {
    storages {
      clearlyDefined {
        serverUrl = "https://api.clearlydefined.io"
      }

      postgres {
        url = ${POSTGRES_URL}
        schema = "public"
        username = "ort"
        password = ${POSTGRES_PASSWORD}
        sslmode = "disable"
      }
    }

    storageReaders: [
      "postgres", "clearlyDefined"
    ]

    storageWriters: [
      "postgres"
    ]
  }
}

schvvarzekatze avatar Feb 28 '22 08:02 schvvarzekatze

Thanks for these confirmations, I'm closing this then.

sschuberth avatar Feb 28 '22 09:02 sschuberth

Hmm. It's still freezing.

I'm on commit main/ddfbf6289b4aef2b5acaf29dee6dd0718b27b52a and running the scanner locally (after ./gradlew installDist).

Config is

storageReaders = [
  clearlyDefined, fileBasedStorage
]

So clearlyDefined is used first.

As mentioned in issue #5116 I also have the issue on latest dev. Commit ID is 98616cc52da42fbe9cfd2b317e90fd4cc2925d04. Storage readers is defined as [ fileBasedStorage, clearlyDefined ].

Caveat: it actually works on some projects (e.g. Eclipse JGit) but freezes on others (e.g. Eclipse Tycho).

borisbaldassari avatar Mar 01 '22 17:03 borisbaldassari

I tested with https://github.com/tornadoweb/tornado (python project) and it worked fine.

martencassel avatar Mar 01 '22 17:03 martencassel

Confirm freeze after a few minutes with latest dev and clearlyDefined Storage readers is defined as [ fileBasedStorage, clearlyDefined ].

janderssonse avatar Mar 16 '22 08:03 janderssonse

Confirm freeze on a dotnetcore project:

ort {
  analyzer {
    ignoreToolVersions = true
    allowDynamicVersions = true
  }

  scanner {
    skipConcluded = false
    
    storages {
      artifactoryStorage {
        backend {
          httpFileStorage {
            url = "https://repo.mycompany.local/artifactory/scan-results/license-scanner"
            headers {
              X-JFrog-Art-Api = "${ARTIFACTORY_API_TOKEN}"
            }
          }
        }
      }

      clearlyDefined {
        serverUrl = "https://api.clearlydefined.io"
      }      
    }

    storageReaders: [
      "artifactoryStorage", "clearlyDefined"
    ]

    storageWriters: [
      "artifactoryStorage"
    ]
  }
}
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by retrofit2.Platform (file:/opt/ort/lib/retrofit-2.9.0.jar) to constructor java.lang.invoke.MethodHandles$Lookup(java.lang.Class,int)
WARNING: Please consider reporting this to the maintainers of retrofit2.Platform
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
07:00:07.015 [DefaultDispatcher-worker-48] INFO  org.ossreviewtoolkit.scanner.storages.ClearlyDefinedStorage - Looking up results for 'NuGet::System.Diagnostics.Contracts:4.3.0'.
07:00:07.015 [DefaultDispatcher-worker-47] INFO  org.ossreviewtoolkit.scanner.storages.ClearlyDefinedStorage - Looking up results for 'NuGet::Microsoft.NETCore.Targets:1.0.1'.
07:00:07.015 [DefaultDispatcher-worker-38] INFO  org.ossreviewtoolkit.scanner.storages.ClearlyDefinedStorage - Looking up results for 'NuGet::System.Data.Common:4.1.0'.
07:00:07.027 [DefaultDispatcher-worker-1] INFO  org.ossreviewtoolkit.scanner.storages.ClearlyDefinedStorage - Looking up results for 'NuGet::System.Diagnostics.DiagnosticSource:4.0.0'.
07:00:07.053 [DefaultDispatcher-worker-69] INFO  org.ossreviewtoolkit.scanner.storages.ClearlyDefinedStorage - Looking up results for 'NuGet::System.Diagnostics.DiagnosticSource:4.3.0'.
07:00:07.053 [DefaultDispatcher-worker-47] INFO  org.ossreviewtoolkit.scanner.storages.ClearlyDefinedStorage - Looking up results for 'NuGet::System.Diagnostics.Process:4.1.0'.
ERROR: Job failed: execution took longer than 1h0m0s seconds

mawl avatar Apr 29 '22 08:04 mawl

For anyone who's still affected by this, can you please share your ORT analyzer result for a public (and as small as possible) project where you can reproduce the scanner freeze?

sschuberth avatar Aug 26 '22 10:08 sschuberth

I still have the freeze for the following package:

11:29:43.338 [DefaultDispatcher-worker-15] INFO  org.ossreviewtoolkit.scanner.storages.ClearlyDefinedStorage - Looking up results for 'Maven:com.github.junrar:junrar:7.5.2'.

schvvarzekatze avatar Aug 30 '22 13:08 schvvarzekatze

Looking up results for 'Maven:com.github.junrar:junrar:7.5.2'.

Just using that single artifact in ClearlyDefinedStorageFunTest still passes. So it's not a matter of a specific artifact being used, but a sum of artifacts that probably triggers some race condition. I'd really need a full analyzer result file for which the hanging in the scanner reproducibly happens.

sschuberth avatar Aug 31 '22 07:08 sschuberth

@sschuberth I just retried it but it is not only a freezing problem with ClearlyDefined.

Scanner logs:

02:51:40.190 [main] INFO  org.ossreviewtoolkit.scanner.Scanner - Scanning ArtifactProvenance(sourceArtifact=RemoteArtifact(url=https://repo.maven.apache.org/maven2/xml-apis/xml-apis/1.4.01/xml-apis-1.4.01-sources.jar, hash=Hash(value=5502da0c4c0a7916a604356ef19394ccfa30cd4b, algorithm=SHA-1))) (438 of 440)...
02:51:40.190 [main] INFO  org.ossreviewtoolkit.scanner.Scanner - Scanning RepositoryProvenance(vcsInfo=VcsInfo(type=Subversion, url=http://svn.apache.org/repos/asf/xml/commons, revision=tags/xml-commons-external-1_3_04, path=), resolvedRevision=1905005) (439 of 440)...
02:51:40.190 [main] INFO  org.ossreviewtoolkit.scanner.Scanner - Scanning ArtifactProvenance(sourceArtifact=RemoteArtifact(url=https://repo.maven.apache.org/maven2/xml-resolver/xml-resolver/1.2/xml-resolver-1.2-sources.jar, hash=Hash(value=6b72432ea1c5296783007365bb7c9a6b39dfb730, algorithm=SHA-1))) (440 of 440)...
02:51:40.193 [main] INFO  org.ossreviewtoolkit.scanner.Scanner - Scanning ArtifactProvenance(sourceArtifact=RemoteArtifact(url=https://repo.maven.apache.org/maven2/ca/juliusdavies/not-yet-commons-ssl/0.3.9/not-yet-commons-ssl-0.3.9-sources.jar, hash=Hash(value=5a5be43f32001f28c0db16a25ffd5824e4ff8fee, algorithm=SHA-1))) (1 of 443)...
02:51:40.198 [main] INFO  org.ossreviewtoolkit.downloader.Downloader - Trying to download source artifact from https://repo.maven.apache.org/maven2/ca/juliusdavies/not-yet-commons-ssl/0.3.9/not-yet-commons-ssl-0.3.9-sources.jar...
02:51:40.559 [main] INFO  org.ossreviewtoolkit.downloader.Downloader - Successfully unpacked https://repo.maven.apache.org/maven2/ca/juliusdavies/not-yet-commons-ssl/0.3.9/not-yet-commons-ssl-0.3.9-sources.jar to '/tmp/ort-DefaultProvenanceDownloader16752603141512883686'...
02:51:40.561 [main] INFO  org.ossreviewtoolkit.scanner.Scanner - Scan of ArtifactProvenance(sourceArtifact=RemoteArtifact(url=https://repo.maven.apache.org/maven2/ca/juliusdavies/not-yet-commons-ssl/0.3.9/not-yet-commons-ssl-0.3.9-
[analyzer-result_scan_freeze.odt](https://github.com/oss-review-toolkit/ort/files/9949379/analyzer-result_scan_freeze.odt)
sources.jar, hash=Hash(value=5a5be43f32001f28c0db16a25ffd5824e4ff8fee, algorithm=SHA-1))) with path scanner 'ScanCode' started.
02:51:40.563 [main] INFO  org.ossreviewtoolkit.utils.common.ProcessCapture - Running '/opt/python/shims/scancode --copyright --license --info --strip-root --timeout 300 --processes 7 /tmp/ort-DefaultProvenanceDownloader16752603141512883686 --json-pp /tmp/ort-ScanCode9217542467426676677/result.json' in '/home/ort'...
02:52:01.295 [main] INFO  org.ossreviewtoolkit.scanner.Scanner - Scan of ArtifactProvenance(sourceArtifact=RemoteArtifact(url=https://repo.maven.apache.org/maven2/ca/juliusdavies/not-yet-commons-ssl/0.3.9/not-yet-commons-ssl-0.3.9-sources.jar, hash=Hash(value=5a5be43f32001f28c0db16a25ffd5824e4ff8fee, algorithm=SHA-1))) with path scanner 'ScanCode' finished.
02:52:01.518 [main] INFO  org.ossreviewtoolkit.scanner.Scanner - Scanning RepositoryProvenance(vcsInfo=VcsInfo(type=Git, url=https://github.com/cglib/cglib.git, revision=, path=), resolvedRevision=cb1c6fb63de87a6e212fbbfe89a4039b2a0da0cd) (2 of 443)...
02:52:01.521 [main] INFO  org.ossreviewtoolkit.downloader.vcs.Git - Trying to fetch only revision 'cb1c6fb63de87a6e212fbbfe89a4039b2a0da0cd' with depth limited to 50.
02:52:02.243 [main] INFO  org.ossreviewtoolkit.utils.common.ProcessCapture - Running 'git checkout cb1c6fb63de87a6e212fbbfe89a4039b2a0da0cd' in '/tmp/ort-DefaultWorkingTreeCache3501329892126098197'...
02:52:02.319 [main] INFO  org.ossreviewtoolkit.scanner.Scanner - Scan of RepositoryProvenance(vcsInfo=VcsInfo(type=Git, url=https://github.com/cglib/cglib.git, revision=, path=), resolvedRevision=cb1c6fb63de87a6e212fbbfe89a4039b2a0da0cd) with path scanner 'ScanCode' started.
02:52:02.322 [main] INFO  org.ossreviewtoolkit.utils.common.ProcessCapture - Running '/opt/python/shims/scancode --copyright --license --info --strip-root --timeout 300 --processes 7 /tmp/ort-DefaultProvenanceDownloader12759187203313322707 --json-pp /tmp/ort-ScanCode18[1933](https://git.intra.xyz.com/xyz/project/-/jobs/665422#L1933)99169704948158/result.json' in '/home/ort'...
02:52:32.115 [main] INFO  org.ossreviewtoolkit.scanner.Scanner - Scan of RepositoryProvenance(vcsInfo=VcsInfo(type=Git, url=https://github.com/cglib/cglib.git, revision=, path=), resolvedRevision=cb1c6fb63de87a6e212fbbfe89a4039b2a0da0cd) with path scanner 'ScanCode' finished.
02:52:32.193 [main] INFO  org.ossreviewtoolkit.scanner.Scanner - Scanning RepositoryProvenance(vcsInfo=VcsInfo(type=Git, url=https://github.com/qos-ch/logback.git, revision=, path=), resolvedRevision=4eb2914cb65214d13e14a9ed3dfb9f044df98358) (3 of 443)...
02:52:32.194 [main] INFO  org.ossreviewtoolkit.downloader.vcs.Git - Trying to fetch only revision '4eb2914cb65214d13e14a9ed3dfb9f044df98358' with depth limited to 50.
02:52:32.894 [main] INFO  org.ossreviewtoolkit.utils.common.ProcessCapture - Running 'git checkout 4eb2914cb65214d13e14a9ed3dfb9f044df98358' in '/tmp/ort-DefaultWorkingTreeCache16301147323002897904'...
02:52:33.263 [main] INFO  org.ossreviewtoolkit.scanner.Scanner - Scan of RepositoryProvenance(vcsInfo=VcsInfo(type=Git, url=https://github.com/qos-ch/logback.git, revision=, path=), resolvedRevision=4eb2914cb65214d13e14a9ed3dfb9f044df98358) with path scanner 'ScanCode' started.
02:52:33.264 [main] INFO  org.ossreviewtoolkit.utils.common.ProcessCapture - Running '/opt/python/shims/scancode --copyright --license --info --strip-root --timeout 300 --processes 7 /tmp/ort-DefaultProvenanceDownloader7422659137122036956 --json-pp /tmp/ort-ScanCode14729230717579736928/result.json' in '/home/ort'...
02:54:59.725 [main] INFO  org.ossreviewtoolkit.scanner.Scanner - Scan of RepositoryProvenance(vcsInfo=VcsInfo(type=Git, url=https://github.com/qos-ch/logback.git, revision=, path=), resolvedRevision=4eb2914cb65214d13e14a9ed3dfb9f044df98358) with path scanner 'ScanCode' finished.
02:55:00.071 [main] INFO  org.ossreviewtoolkit.scanner.Scanner - Scanning RepositoryProvenance(vcsInfo=VcsInfo(type=Git, url=https://github.com/qos-ch/logback-db.git, revision=, path=), resolvedRevision=b5fe1a5be19e4bbfaea101f5a4029d9c37979781) (4 of 443)...
02:55:00.071 [main] INFO  org.ossreviewtoolkit.downloader.vcs.Git - Trying to fetch only revision 'b5fe1a5be19e4bbfaea101f5a4029d9c37979781' with depth limited to 50.
02:55:00.717 [main] INFO  org.ossreviewtoolkit.utils.common.ProcessCapture - Running 'git checkout b5fe1a5be19e4bbfaea101f5a4029d9c37979781' in '/tmp/ort-DefaultWorkingTreeCache11634058159738080768'...
02:55:00.749 [main] INFO  org.ossreviewtoolkit.scanner.Scanner - Scan of RepositoryProvenance(vcsInfo=VcsInfo(type=Git, url=https://github.com/qos-ch/logback-db.git, revision=, path=), resolvedRevision=b5fe1a5be19e4bbfaea101f5a4029d9c37979781) with path scanner 'ScanCode' started.
02:55:00.751 [main] INFO  org.ossreviewtoolkit.utils.common.ProcessCapture - Running '/opt/python/shims/scancode --copyright --license --info --strip-root --timeout 300 --processes 7 /tmp/ort-DefaultProvenanceDownloader15627348782256068571 --json-pp /tmp/ort-ScanCode14050720135703560019/result.json' in '/home/ort'...
02:55:17.128 [main] INFO  org.ossreviewtoolkit.scanner.Scanner - Scan of RepositoryProvenance(vcsInfo=VcsInfo(type=Git, url=https://github.com/qos-ch/logback-db.git, revision=, path=), resolvedRevision=b5fe1a5be19e4bbfaea101f5a4029d9c37979781) with path scanner 'ScanCode' finished.
02:55:17.173 [main] INFO  org.ossreviewtoolkit.scanner.Scanner - Scanning RepositoryProvenance(vcsInfo=VcsInfo(type=Git, url=https://github.com/assertthat/selenium-shutterbug.git, revision=v1.6, path=), resolvedRevision=3f2b4d80acc11e1778eafb3d4bf94fa75f2d8d48) (5 of 443)...
02:55:17.175 [main] INFO  org.ossreviewtoolkit.downloader.vcs.Git - Trying to fetch only revision '3f2b4d80acc11e1778eafb3d4bf94fa75f2d8d48' with depth limited to 50.
02:55:18.072 [main] INFO  org.ossreviewtoolkit.utils.common.ProcessCapture - Running 'git checkout 3f2b4d80acc11e1778eafb3d4bf94fa75f2d8d48' in '/tmp/ort-DefaultWorkingTreeCache4751513326436822356'...
02:55:18.102 [main] INFO  org.ossreviewtoolkit.scanner.Scanner - Scan of RepositoryProvenance(vcsInfo=VcsInfo(type=Git, url=https://github.com/assertthat/selenium-shutterbug.git, revision=v1.6, path=), resolvedRevision=3f2b4d80acc11e1778eafb3d4bf94fa75f2d8d48) with path scanner 'ScanCode' started.
02:55:18.105 [main] INFO  org.ossreviewtoolkit.utils.common.ProcessCapture - Running '/opt/python/shims/scancode --copyright --license --info --strip-root --timeout 300 --processes 7 /tmp/ort-DefaultProvenanceDownloader12479216354442308907 --json-pp /tmp/ort-ScanCode1021319603976230699/result.json' in '/home/ort'...
02:55:29.115 [main] INFO  org.ossreviewtoolkit.scanner.Scanner - Scan of RepositoryProvenance(vcsInfo=VcsInfo(type=Git, url=https://github.com/assertthat/selenium-shutterbug.git, revision=v1.6, path=), resolvedRevision=3f2b4d80acc11e1778eafb3d4bf94fa75f2d8d48) with path scanner 'ScanCode' finished.
02:55:29.150 [main] INFO  org.ossreviewtoolkit.scanner.Scanner - Scanning RepositoryProvenance(vcsInfo=VcsInfo(type=Git, url=https://github.com/auth0/java-jwt.git, revision=, path=), resolvedRevision=3526fcc80ec15038de1a13a2c84d55d2453c835a) (6 of 443)...
02:55:29.151 [main] INFO  org.ossreviewtoolkit.downloader.vcs.Git - Trying to fetch only revision '3526fcc80ec15038de1a13a2c84d55d2453c835a' with depth limited to 50.
02:55:31.298 [main] INFO  org.ossreviewtoolkit.utils.common.ProcessCapture - Running 'git checkout 3526fcc80ec15038de1a13a2c84d55d2453c835a' in '/tmp/ort-DefaultWorkingTreeCache14908445555009094174'...
02:55:31.335 [main] INFO  org.ossreviewtoolkit.scanner.Scanner - Scan of RepositoryProvenance(vcsInfo=VcsInfo(type=Git, url=https://github.com/auth0/java-jwt.git, revision=, path=), resolvedRevision=3526fcc80ec15038de1a13a2c84d55d2453c835a) with path scanner 'ScanCode' started.
02:55:31.337 [main] INFO  org.ossreviewtoolkit.utils.common.ProcessCapture - Running '/opt/python/shims/scancode --copyright --license --info --strip-root --timeout 300 --processes 7 /tmp/ort-DefaultProvenanceDownloader2625284857958280373 --json-pp /tmp/ort-ScanCode12673091786378993328/result.json' in '/home/ort'...
02:55:46.254 [main] INFO  org.ossreviewtoolkit.scanner.Scanner - Scan of RepositoryProvenance(vcsInfo=VcsInfo(type=Git, url=https://github.com/auth0/java-jwt.git, revision=, path=), resolvedRevision=3526fcc80ec15038de1a13a2c84d55d2453c835a) with path scanner 'ScanCode' finished.
02:55:46.287 [main] INFO  org.ossreviewtoolkit.scanner.Scanner - Scanning RepositoryProvenance(vcsInfo=VcsInfo(type=Git, url=https://github.com/Azure/azure-sdk-for-java.git, revision=, path=), resolvedRevision=17d29997d01f69d5401ee2750049dc34d667820f) (7 of 443)...
02:55:46.288 [main] INFO  org.ossreviewtoolkit.downloader.vcs.Git - Trying to fetch only revision '17d29997d01f69d5401ee2750049dc34d667820f' with depth limited to 50.
02:55:51.351 [main] INFO  org.ossreviewtoolkit.utils.common.ProcessCapture - Running 'git checkout 17d29997d01f69d5401ee2750049dc34d667820f' in '/tmp/ort-DefaultWorkingTreeCache15408588066732132080'...
02:56:15.073 [main] INFO  org.ossreviewtoolkit.scanner.Scanner - Scan of RepositoryProvenance(vcsInfo=VcsInfo(type=Git, url=https://github.com/Azure/azure-sdk-for-java.git, revision=, path=), resolvedRevision=17d29997d01f69d5401ee2750049dc34d667820f) with path scanner 'ScanCode' started.
02:56:15.075 [main] INFO  org.ossreviewtoolkit.utils.common.ProcessCapture - Running '/opt/python/shims/scancode --copyright --license --info --strip-root --timeout 300 --processes 7 /tmp/ort-DefaultProvenanceDownloader7839566481639989688 --json-pp /tmp/ort-ScanCode6138186978180172915/result.json' in '/home/ort'...
05:50:47.103 [main] INFO  org.ossreviewtoolkit.scanner.Scanner - Scan of RepositoryProvenance(vcsInfo=VcsInfo(type=Git, url=https://github.com/Azure/azure-sdk-for-java.git, revision=, path=), resolvedRevision=17d29997d01f69d5401ee2750049dc34d667820f) with path scanner 'ScanCode' finished.
05:51:02.419 [main] INFO  org.ossreviewtoolkit.scanner.Scanner - Scanning RepositoryProvenance(vcsInfo=VcsInfo(type=Git, url=ssh://[email protected]/Azure/azure-sdk-for-java.git, revision=, path=), resolvedRevision=d270b2412344651daefdca1dba4f5d8e60879cb5) (8 of 443)...
05:51:02.456 [main] INFO  org.ossreviewtoolkit.downloader.vcs.Git - Trying to fetch only revision 'd270b2412344651daefdca1dba4f5d8e60879cb5' with depth limited to 50.
05:51:08.820 [main] INFO  org.ossreviewtoolkit.utils.common.ProcessCapture - Running 'git checkout d270b2412344651daefdca1dba4f5d8e60879cb5' in '/tmp/ort-DefaultWorkingTreeCache2293279998572604305'...
05:51:41.684 [main] INFO  org.ossreviewtoolkit.scanner.Scanner - Scan of RepositoryProvenance(vcsInfo=VcsInfo(type=Git, url=ssh://[email protected]/Azure/azure-sdk-for-java.git, revision=, path=), resolvedRevision=d270b2412344651daefdca1dba4f5d8e60879cb5) with path scanner 'ScanCode' started.
05:51:41.704 [main] INFO  org.ossreviewtoolkit.utils.common.ProcessCapture - Running '/opt/python/shims/scancode --copyright --license --info --strip-root --timeout 300 --processes 7 /tmp/ort-DefaultProvenanceDownloader18329071032314066616 --json-pp /tmp/ort-ScanCode163400586687[1952](https://git.intra.xyz.com/xyz/project/-/jobs/665422#L1952)4913/result.json' in '/home/ort'...

I used the following ort config:

ort:
	licenseFilePatterns:
    licenseFilenames: ['license*']
    patentFilenames: [patents]
    rootLicenseFilenames: ['readme*']

	severeIssueThreshold: ERROR
	severeRuleViolationThreshold: ERROR

	enableRepositoryPackageCurations: true
	enableRepositoryPackageConfigurations: true

	analyzer:
        allowDynamicVersions: true

		# A list of enabled package managers.
		enabledPackageManagers: [Yarn, Gradle]

		packageManagers:
			Gradle:
                # A list of package manager names that this package manager must run after. For example, this can be used if
                # another package manager generates files that this package manager requires to run correctly.
				#mustRunAfter: [Yarn]
				options:
					directDependenciesOnly: true
					disableRegistryCertificateVerification: true

			Yarn:
				options:
					# If set to true, disable verification of HTTPS certificate of remote registries. Useful when using a proxy to
					# intercept requests to the registry.
					disableRegistryCertificateVerification: true
					directDependenciesOnly: true

	downloader:
		allowMovingRevisions: true

          # Only used by the CLI tool when the '--license-classifications-file' option is specified.
		includedLicenseCategories:
			- copyleft
			- copyleft-provide-sourcecode
			- weak-copyleft
			- weak-copyleft-provide-sourcecode
			- proprietary
			- permissive
			- public-domain
			- no-assertion
			- not-for-commercial-use
			- include-in-notice-file
			- include-source-code-offer-in-notice-file

		sourceCodeOrigins: [VCS, ARTIFACT]

	scanner:
        skipConcluded: true

		archive:
			enabled: false

		postgresStorage:
			type: "PACKAGE_BASED"
			connection:
				url: ${POSTGRES_URL}
				schema: public
				username: ort
				password: ${POSTGRES_PASSWORD}
				sslmode: disable
				parallelTransactions: 5

		createMissingArchives = false

		detectedLicenseMapping:
			LGPL v2.1: 'LGPL-2.1-only'

        options:
            # A map of maps from scanner class names to scanner-specific key-value pairs.
            ScanCode:
                commandLine: '--copyright --license --info --strip-root --timeout 300'
                parseLicenseExpressions: true

                # Criteria for matching stored scan results. These can be configured for any scanner that uses semantic
                # versioning. Note that the 'maxVersion' is exclusive and not part of the range of accepted versions.
                minVersion: '3.2.1-rc2'
                maxVersion: '30.2.0'

		storages:
			clearlyDefined:
				serverUrl: 'https://api.clearlydefined.io'

			postgres:
                connection:
					url: ${POSTGRES_URL}
					schema: public
					username: ort
					password: ${POSTGRES_PASSWORD}
					sslmode: disable

		storageReaders: [postgres, clearlyDefined]

		storageWriters: [postgres]

I also attached the analyzer-result (yml is not possible as file attachment).

schvvarzekatze avatar Nov 07 '22 08:11 schvvarzekatze

@schvvarzekatze could be that you issue is unrelated. You're, amongst others, scanning the Azure SDK, which is massive, and will take hours to days to scan. Also, you should upgrade to the new provenance-based scanner infrastructure and configure a provenanceStorage instead of a storage as soon as possible to avoid rescans of mono-repos.

sschuberth avatar Nov 07 '22 08:11 sschuberth

Thank you, it also seems to freeze without clearly defined. I just try it now with provenance storage

schvvarzekatze avatar Nov 07 '22 09:11 schvvarzekatze

I just try it now with provenance storage

Note that the initial scan will still take long.

sschuberth avatar Nov 07 '22 09:11 sschuberth