[Bug][GitHub GraphQL] Fetching Pull Requests fails with HTTP 502
Search before asking
- [x] I had searched in the issues and found no similar issues.
What happened
In one of our projects, the GitHub GraphQL PR collector fails reproducibly with HTTP 502. I found the old bug #3708 and tried reducing the page size further, but without success. The failure always seems to happen at the same collector page (which page depends on the page size).
What do you expect to happen
I would expect the job not to fail, or at least to log more useful information about the query that failed. I only see the response body (the standard nginx 502 Bad Gateway error).
How to reproduce
This seems to be related to the actual data of one private repository, so I can't share a way to reproduce it. I'm not familiar with Go programming, but I can rebuild the backend locally with Docker and would appreciate help narrowing this down.
Anything else
No response
Version
v1.0.2-rc1
Are you willing to submit PR?
- [x] Yes I am willing to submit a PR!
Code of Conduct
- [x] I agree to follow this project's Code of Conduct
Noted. Please take a look @klesh
5xx errors indicate a server-side issue (in this case, with the GitHub API) where the server encounters a problem it cannot properly handle. These errors can be triggered by various factors such as page size or other request-specific conditions, so it's difficult to say definitively whether you're encountering the same issue as described in #3708.
Ideally, GitHub should address the underlying cause of the 5xx error and return a more appropriate response, such as a 4xx error indicating the request size is too large or something along those lines. I recommend reporting this issue to GitHub so they can investigate further.
That said, if you’d like to dig deeper into this on your end, feel free to reach out. I’m happy to help however I can.
Not sure what changed, but this error didn't occur any more after a while.
With a second clean installation and the same project setup, the error is occurring again, so I'm reopening the ticket.
It would be helpful if I could get DevLake to at least skip the PR and continue with the job.
I get the following stack trace:
attached stack trace
-- stack trace:
| github.com/apache/incubator-devlake/server/services.RunTasksStandalone
| /app/server/services/task.go:217
| github.com/apache/incubator-devlake/server/services.(*pipelineRunner).runPipelineStandalone.func1
| /app/server/services/pipeline_runner.go:42
| github.com/apache/incubator-devlake/core/runner.runPipelineTasks
| /app/core/runner/run_pipeline.go:90
| github.com/apache/incubator-devlake/core/runner.RunPipeline
| /app/core/runner/run_pipeline.go:54
| github.com/apache/incubator-devlake/server/services.(*pipelineRunner).runPipelineStandalone
| /app/server/services/pipeline_runner.go:38
| github.com/apache/incubator-devlake/server/services.runPipeline
| /app/server/services/pipeline_runner.go:76
| github.com/apache/incubator-devlake/server/services.RunPipelineInQueue.func1
| /app/server/services/pipeline.go:360
| runtime.goexit
| /usr/local/go/src/runtime/asm_amd64.s:1598
Wraps: (4) attached stack trace
| -- stack trace:
| github.com/apache/incubator-devlake/server/services.RunTasksStandalone.func1
| /app/server/services/task.go:189
| Wraps: (2) Error running task 309.
| Wraps: (3) attached stack trace
| -- stack trace:
| github.com/apache/incubator-devlake/core/runner.RunPluginSubTasks
| /app/core/runner/run_task.go:333
| [...repeated from below...]
| Wraps: (4) subtask Collect Pull Requests ended unexpectedly
| Wraps: (5) attached stack trace
| -- stack trace:
| github.com/apache/incubator-devlake/helpers/pluginhelper/api.(*GraphqlCollector).Execute
| /app/helpers/pluginhelper/api/graphql_collector.go:189
| github.com/apache/incubator-devlake/helpers/pluginhelper/api.(*StatefulApiCollector).Execute
| /app/helpers/pluginhelper/api/api_collector_stateful.go:97
| github.com/apache/incubator-devlake/plugins/github_graphql/tasks.CollectPrs
| /app/plugins/github_graphql/tasks/pr_collector.go:282
| github.com/apache/incubator-devlake/core/runner.runSubtask
| /app/core/runner/run_task.go:425
| github.com/apache/incubator-devlake/core/runner.RunPluginSubTasks
| /app/core/runner/run_task.go:330
| github.com/apache/incubator-devlake/core/runner.RunPluginTask
| /app/core/runner/run_task.go:165
| github.com/apache/incubator-devlake/core/runner.RunTask
| /app/core/runner/run_task.go:139
| github.com/apache/incubator-devlake/server/services.runTaskStandalone
| /app/server/services/task_runner.go:114
| github.com/apache/incubator-devlake/server/services.RunTasksStandalone.func1
| /app/server/services/task.go:187
| runtime.goexit
| /usr/local/go/src/runtime/asm_amd64.s:1598
| Wraps: (6) handle failed in graphql collector
Error types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *withstack.withStack (4) *errutil.withPrefix (5) *withstack.withStack (6) *errutil.leafError
Error types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *withstack.withStack (4) *errutil.leafError
I was able to get PR collection to work by rebuilding the backend with a page size of 1.
As there is already some retry logic, would it be possible to reduce the page size on each retry until the 502 errors vanish?
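To make the suggestion concrete, here is a rough sketch of what shrinking the page size on retry could look like. It is not DevLake code: `fetchPage`, `errBadGateway`, and `fetchWithShrinkingPage` are hypothetical names, and it assumes the client can tell a 502 apart from other failures.

```go
package main

import (
	"errors"
	"fmt"
)

// errBadGateway is a hypothetical sentinel standing in for however the real
// client reports an HTTP 502.
var errBadGateway = errors.New("http 502 bad gateway")

// fetchPage is a hypothetical callback that requests one page of pull
// requests starting at cursor with the given page size.
type fetchPage func(cursor string, pageSize int) error

// fetchWithShrinkingPage retries a single page, halving the page size after
// every 502 until the request succeeds or the size is already 1.
// It returns the page size used by the last attempt.
func fetchWithShrinkingPage(fetch fetchPage, cursor string, pageSize int) (int, error) {
	if pageSize < 1 {
		return 0, fmt.Errorf("invalid page size %d", pageSize)
	}
	for size := pageSize; ; size /= 2 {
		err := fetch(cursor, size)
		if err == nil {
			return size, nil
		}
		// Only a 502 justifies shrinking; any other error is returned as-is.
		if !errors.Is(err, errBadGateway) {
			return size, err
		}
		if size == 1 {
			return size, fmt.Errorf("page at cursor %q still fails with page size 1: %w", cursor, err)
		}
	}
}
```

Starting from 10, this would try 10, 5, 2, and 1 for the problematic page only; the caller could go back to the original size for the next cursor so that other pages keep the larger page size.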
Good catch. Changing the page size dynamically is too complicated and hard to maintain IMO. How about exposing an environment variable for changing the page size?
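A minimal sketch of the environment-variable idea, under stated assumptions: the variable name `GITHUB_GRAPHQL_PR_PAGE_SIZE` in the usage note, the fallback of 10, and the function names are all made up for illustration and are not existing DevLake settings. The upper bound of 100 reflects GitHub's documented limit on the `first`/`last` arguments of GraphQL connections.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// defaultPageSize is an illustrative fallback, not DevLake's real default.
const defaultPageSize = 10

// parsePageSize resolves the page size from a lookup function with the same
// shape as os.LookupEnv, which keeps the parsing logic easy to test.
func parsePageSize(name string, lookup func(string) (string, bool)) (int, error) {
	raw, ok := lookup(name)
	if !ok || raw == "" {
		return defaultPageSize, nil
	}
	n, err := strconv.Atoi(raw)
	if err != nil {
		return 0, fmt.Errorf("%s=%q is not an integer: %w", name, raw, err)
	}
	// GitHub's GraphQL API accepts 1 to 100 items per connection page.
	if n < 1 || n > 100 {
		return 0, fmt.Errorf("%s=%d is outside the range 1..100", name, n)
	}
	return n, nil
}

// pageSizeFromEnv reads the page size from the process environment.
func pageSizeFromEnv(name string) (int, error) {
	return parsePageSize(name, os.LookupEnv)
}
```

A collector could call `pageSizeFromEnv("GITHUB_GRAPHQL_PR_PAGE_SIZE")` once at startup and fail fast on an invalid value instead of silently falling back.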
That would already be an improvement. However, it would still affect all projects, even the ones that can handle much larger page sizes (all 6 other projects, with different repositories, worked fine with a page size of 10).
This issue has been automatically marked as stale because it has been inactive for 60 days. It will be closed in the next 7 days if no further activity occurs.
This issue has been closed because it has been inactive for a long time. You can reopen it if you encounter a similar problem in the future.