Scanner fails to pull Git submodules

Open voidpetal opened this issue 6 months ago • 2 comments

Describe the bug

When a repository contains Git submodules, the scanner fails to pull them and logs the warning "Git submodule at 'app/libraries' not initialized". As a result, the submodule is not scanned.

This is the example repo with the submodule warning:

To Reproduce

Steps to reproduce the behavior:

  1. Run scanner on main repo
  2. See error

Expected behavior

The submodule is initialized correctly and the repository is fully scanned.

Console / log output

[DefaultDispatcher-worker-1] WARN  org.ossreviewtoolkit.plugins.versioncontrolsystems.git.GitWorkingTree - Git submodule at 'app/libraries' not initialized. Cannot recursively list its submodules.

Environment

  • ORT version: 59.3.0
  • Java version: 21
  • OS: macOS, Linux

voidpetal avatar Jun 03 '25 10:06 voidpetal

Can you share the analyzer result for convenience?

sschuberth avatar Jun 03 '25 11:06 sschuberth

Sure, here are the analyzer and scan results. I appended .txt to the file names because uploading .yml files is not allowed.

analyzer-result.yml.txt scan-result.yml.txt

voidpetal avatar Jun 04 '25 08:06 voidpetal

Hi @sschuberth, any updates on this? Thanks!

voidpetal avatar Jul 04 '25 07:07 voidpetal

No, I'm not aware of anyone working on this. If you want to prioritize this, feel free to reach out to me by mail (you can get it from the Git history) to discuss some options. I'm also in contact with @bennati in this regard.

sschuberth avatar Jul 04 '25 08:07 sschuberth

I see that the syntax being used in your .gitmodules file is

[submodule "app/libraries"]
	path = app/libraries
	url = ../ort-test-libs

So the URL is relative. While that's allowed according to the spec, it's a bit unusual, and I wonder whether the problem goes away when using a regular absolute URL.
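
For illustration, here is a minimal Python sketch of how Git resolves a relative submodule URL against the superproject's remote URL. It only approximates Git's behavior for the "./" and "../" prefixes and is not code from Git or ORT. The function name resolve_submodule_url is hypothetical, and the SSH superproject URL in the usage lines is an assumed example rather than something taken from the repository.

```python
def resolve_submodule_url(remote_url: str, submodule_url: str) -> str:
    """Approximate Git's resolution of a relative submodule URL.

    Hypothetical helper for illustration; not part of Git or ORT.
    """
    # URLs that do not start with "./" or "../" are used as-is.
    if not submodule_url.startswith(("./", "../")):
        return submodule_url

    base = remote_url.rstrip("/")
    separator = "/"
    rest = submodule_url

    while rest.startswith(("./", "../")):
        if rest.startswith("../"):
            rest = rest[3:]
            slash = base.rfind("/")
            if slash >= 0:
                # Drop the last path component of the remote URL.
                base, separator = base[:slash], "/"
            else:
                # scp-like URL with only "host:path" left, cut at the colon.
                colon = base.rfind(":")
                if colon < 0:
                    raise ValueError(f"cannot strip a component from {remote_url!r}")
                base, separator = base[:colon], ":"
        else:
            rest = rest[2:]

    return base + separator + rest


# HTTPS superproject URL as listed in the analyzer result.
https_result = resolve_submodule_url(
    "https://gitlab.com/sergey.burnevsky/ort-test-main.git", "../ort-test-libs"
)

# Assumed SSH (scp-like) superproject URL, for comparison only.
ssh_result = resolve_submodule_url(
    "git@gitlab.com:sergey.burnevsky/ort-test-main.git", "../ort-test-libs"
)
```

Under these assumptions, the transport of the resolved submodule URL simply follows the transport of the remote URL it is resolved against.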

sschuberth avatar Aug 06 '25 12:08 sschuberth

Ah, there we have it, I believe, looking at the analyzer result's repository section:

repository:
  vcs:
    type: "Git"
    url: "https://gitlab.com/sergey.burnevsky/ort-test-main.git"
    revision: "fa1b28e775d5cff417508b27975b380977d97d6f"
    path: ""
  vcs_processed:
    type: "Git"
    url: "https://gitlab.com/sergey.burnevsky/ort-test-main.git"
    revision: "fa1b28e775d5cff417508b27975b380977d97d6f"
    path: ""
  nested_repositories:
    app/libraries:
      type: "Git"
      url: "git@gitlab.com:sergey.burnevsky/ort-test-libs"
      revision: "6095c61350564683cc2d93fd1175e1b15067ca5f"
      path: ""
  config: {}

So the entry under nested_repositories is properly recognized; however, the relative URL is resolved to use SSH instead of HTTPS as the transport. As I guess that no SSH credentials were configured, the submodule checkout naturally fails.

The underlying question now is: Why does the submodule use SSH, if the superproject’s origin repository uses HTTPS? Guessing further, maybe you have some kind of URL replacement configured, which does not work for submodules (could be related to https://github.com/oss-review-toolkit/ort/issues/8918).

sschuberth avatar Aug 06 '25 12:08 sschuberth

Hi @sschuberth, thanks for looking into this!

It seems the Git submodule was initialized locally with an SSH URL, and this got propagated to the remote. After changing the SSH URL to an HTTPS URL with git submodule set-url app/libraries https://gitlab.com/sergey.burnevsky/ort-test-libs.git, the submodule warning seems to have disappeared.
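
As a rough sketch of the URL change described above, the following Python helper rewrites an SSH-style Git URL to its HTTPS counterpart. The name ssh_to_https and the regular expression are hypothetical and only cover the common scp-like and ssh:// forms. The helper does not append a .git suffix, whereas the URL passed to git submodule set-url above ends in .git.

```python
import re

# scp-like syntax: optional "user@", then "host:path" (but not "scheme://").
SCP_LIKE = re.compile(r"^(?:[\w.-]+@)?([\w.-]+):(?!//)(.+)$")


def ssh_to_https(url: str) -> str:
    """Rewrite an SSH-style Git URL to HTTPS; other URLs are returned unchanged.

    Hypothetical helper for illustration; not part of Git or ORT.
    """
    if url.startswith("ssh://"):
        rest = url[len("ssh://"):]
        rest = rest.split("@", 1)[-1]      # drop an optional "user@"
        host, _, path = rest.partition("/")
        host = host.split(":", 1)[0]       # drop an optional ":port"
        return f"https://{host}/{path}"

    match = SCP_LIKE.match(url)
    if match:
        host, path = match.group(1), match.group(2).lstrip("/")
        return f"https://{host}/{path}"

    return url


converted = ssh_to_https("git@gitlab.com:sergey.burnevsky/ort-test-libs")
```

The converted value could then be passed to git submodule set-url for the affected submodule path.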

However, we are still encountering an issue where the submodule is duplicated, as reported here

voidpetal avatar Aug 12 '25 12:08 voidpetal

However, we are still encountering an issue where the submodule is duplicated, as reported here

OK, but as that's tracked separately, are we good to close this issue?

sschuberth avatar Aug 27 '25 15:08 sschuberth

Absolutely :)

voidpetal avatar Aug 28 '25 10:08 voidpetal