[Bug][GitExtractor] GitExtractor fails with "Invalid Git URL" when using GitHub App authentication

Open · cedriclecoz opened this issue 1 month ago · 1 comment

Search before asking

  • [x] I have searched the issues and found no similar issues.

What happened

When using GitHub App authentication to configure a GitHub connection in DevLake, the gitextractor task fails during project import with an "Invalid Git URL" error. This issue does not occur when using Personal Access Token (PAT) authentication.

  • The gitextractor task fails immediately with an "Invalid Git URL" error
  • The CloneUrl field in _tool_github_repos is empty/null
  • A malformed URL is constructed: //git:ghs_XXXXX@ (missing hostname and repository path)
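The shape of that malformed URL is what one would expect when credentials are injected into an empty clone URL. The following is a minimal sketch using only Go's standard `net/url` package; `injectToken` is a hypothetical helper written for illustration and is not DevLake's actual implementation, which may build the URL differently.

```go
package main

import (
	"fmt"
	"net/url"
)

// injectToken is a hypothetical helper (not DevLake code): it adds
// "git:<token>" credentials to a clone URL using only net/url.
func injectToken(cloneURL, token string) (string, error) {
	u, err := url.Parse(cloneURL)
	if err != nil {
		return "", err
	}
	u.User = url.UserPassword("git", token)
	return u.String(), nil
}

func main() {
	// An empty clone URL parses without error, so only the credentials survive.
	broken, _ := injectToken("", "ghs_XXXXX")
	fmt.Println(broken) // //git:ghs_XXXXX@

	// With a populated clone URL the result is a usable HTTPS remote.
	ok, _ := injectToken("https://github.com/example-org/example-repo.git", "ghs_XXXXX")
	fmt.Println(ok) // https://git:ghs_XXXXX@github.com/example-org/example-repo.git
}
```

If the real code path behaves similarly, the empty CloneUrl in _tool_github_repos would be the root cause and the "Invalid Git URL" error only a downstream symptom.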

What do you expect to happen

The gitextractor task should successfully clone the repository using the GitHub App credentials.

How to reproduce

  1. Configure a GitHub connection using GitHub App authentication (App ID + Private Key + Installation ID)
  2. Successfully test the connection (connection test passes)
  3. Create a project using that connection to get data on a GitHub repo.
  4. Run a collection.
  5. Observe that the gitextractor task fails with an "Invalid Git URL" error

The GitHub App was created with the permissions described in https://devlake.apache.org/docs/Configuration/GitHub#github-apps-beta, all set to Read-Only.

I am running v1.0.3-beta8 with Docker Compose.

Anything else

time="2025-11-14 13:17:10" level=debug msg="plan[1][1] is &{Plugin:gitextractor Subtasks:[] Options:map[connectionId:1 fullName:example-org/example-repo name:example-org/example-repo pluginName:github proxy: repoId:github:GithubRepo:1:12345678 url://git:ghs_XXXXXXXXXXXXXXXXXXXXXXXXXXXX@]}\n"

Notice the malformed value of the url option: //git:ghs_XXXXXXXXXXXXXXXXXXXXXXXXXXXX@ (missing hostname and repository path)

Full Error Stack Trace:

time="2025-11-14 13:17:11" level=error msg=" [task service] task failed
	caused by: attached stack trace
	  -- stack trace:
	  | github.com/apache/incubator-devlake/server/services.RunTasksStandalone.func1
	  | 	/app/server/services/task.go:189
	Wraps: (2) Error running task 85.
	Wraps: (3) attached stack trace
	  -- stack trace:
	  | github.com/apache/incubator-devlake/core/runner.RunPluginSubTasks
	  | 	/app/core/runner/run_task.go:250
	  | github.com/apache/incubator-devlake/core/runner.RunPluginTask
	  | 	/app/core/runner/run_task.go:165
	  | github.com/apache/incubator-devlake/core/runner.RunTask
	  | 	/app/core/runner/run_task.go:139
	  | github.com/apache/incubator-devlake/server/services.runTaskStandalone
	  | 	/app/server/services/task_runner.go:114
	  | github.com/apache/incubator-devlake/server/services.RunTasksStandalone.func1
	  | 	/app/server/services/task.go:187
	Wraps: (4) error preparing task data for gitextractor
	Wraps: (5) attached stack trace
	  -- stack trace:
	  | github.com/apache/incubator-devlake/plugins/gitextractor/impl.GitExtractor.PrepareTaskData
	  | 	/app/plugins/gitextractor/impl/impl.go:85
	  | [...repeated from below...]
	Wraps: (6) failed to get Git URL
	Wraps: (7) attached stack trace
	  -- stack trace:
	  | github.com/apache/incubator-devlake/plugins/github/impl.replaceAcessTokenInUrl
	  | 	/app/plugins/github/impl/impl.go:355
	  | github.com/apache/incubator-devlake/plugins/github/impl.Github.GetDynamicGitUrl
	  | 	/app/plugins/github/impl/impl.go:270
	  | github.com/apache/incubator-devlake/plugins/gitextractor/impl.GitExtractor.PrepareTaskData
	  | 	/app/plugins/gitextractor/impl/impl.go:83
	  | github.com/apache/incubator-devlake/core/runner.RunPluginSubTasks
	  | 	/app/core/runner/run_task.go:248
	  | github.com/apache/incubator-devlake/core/runner.RunPluginTask
	  | 	/app/core/runner/run_task.go:165
	  | github.com/apache/incubator-devlake/core/runner.RunTask
	  | 	/app/core/runner/run_task.go:139
	  | github.com/apache/incubator-devlake/server/services.runTaskStandalone
	  | 	/app/server/services/task_runner.go:114
	  | github.com/apache/incubator-devlake/server/services.RunTasksStandalone.func1
	  | 	/app/server/services/task.go:187
	  | runtime.goexit
	  | 	/usr/local/go/src/runtime/asm_amd64.s:1598
	Wraps: (8) Invalid Git URL
	Error types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *withstack.withStack (4) *errutil.withPrefix (5) *withstack.withStack (6) *errutil.withPrefix (7) *withstack.withStack (8) *errutil.leafError"

Version

v1.0.3-beta8@cfe519c

Are you willing to submit PR?

  • [ ] Yes I am willing to submit a PR!

Code of Conduct

cedriclecoz · Nov 14 '25 14:11

I think this might have been caused by the database already being populated with data collected earlier through a GitHub PAT. I have since dropped all the tables in my DB and triggered a fresh collection using the App only (PAT no longer configured), and it worked fine. So there is something strange somewhere, but it is definitely not a P0.

cedriclecoz · Nov 24 '25 10:11