fix: flyteadmin doesn't shutdown servers gracefully
Tracking issue
Closes #6274
Why are the changes needed?
To handle SIGTERM properly, ensure cleanup of resources and provide smooth user experience
What changes were proposed in this pull request?
Implement graceful shutdown for flyteadmin servers
How was this patch tested?
Labels
Please add one or more of the following labels to categorize your PR:
- added: For new features.
- changed: For changes in existing functionality.
- deprecated: For soon-to-be-removed features.
- removed: For features being removed.
- fixed: For any bug fixed.
- security: In case of vulnerabilities
This is important to improve the readability of release notes.
Setup process
Screenshots
Check all the applicable boxes
- [ ] I updated the documentation accordingly.
- [ ] All new and existing tests passed.
- [x] All commits are signed-off.
Related PRs
Docs link
Summary by Bito
This PR implements graceful shutdown mechanisms for flyteadmin servers by introducing configurable timeout settings (replacing hardcoded 10-second values), adding signal handling and timeout-based contexts. It enhances error handling in gRPC server startup, updates configuration files with command-line flag support, and ensures proper termination of both HTTP and gRPC servers to prevent resource leaks and improve service reliability.Unit tests added: False
Estimated effort to review (1-5, lower is better): 1
Code Review Agent Run Status
- Limitations and other issues: ❌ Failure - The AI Code Review Agent skipped reviewing this change because it is configured to exclude certain pull requests based on the source/target branch or the pull request status. You can change the settings here, or contact the agent instance creator at [email protected].
Codecov Report
Attention: Patch coverage is 1.92308% with 51 lines in your changes missing coverage. Please review.
Project coverage is 58.64%. Comparing base (
588ceec) to head (9c26603). Report is 109 commits behind head on master.
| Files with missing lines | Patch % | Lines |
|---|---|---|
| flyteadmin/pkg/server/service.go | 0.00% | 51 Missing :warning: |
Additional details and impacted files
@@ Coverage Diff @@
## master #6289 +/- ##
==========================================
+ Coverage 50.37% 58.64% +8.27%
==========================================
Files 1170 938 -232
Lines 92851 71527 -21324
==========================================
- Hits 46774 41948 -4826
+ Misses 41931 26390 -15541
+ Partials 4146 3189 -957
| Flag | Coverage Δ | |
|---|---|---|
| unittests-datacatalog | 59.03% <ø> (+7.44%) |
:arrow_up: |
| unittests-flyteadmin | 56.11% <1.92%> (+4.24%) |
:arrow_up: |
| unittests-flytecopilot | 39.56% <ø> (+8.56%) |
:arrow_up: |
| unittests-flytectl | 64.72% <ø> (+2.39%) |
:arrow_up: |
| unittests-flyteidl | 76.12% <ø> (?) |
|
| unittests-flyteplugins | 61.14% <ø> (+7.01%) |
:arrow_up: |
| unittests-flytepropeller | 54.84% <ø> (+11.98%) |
:arrow_up: |
| unittests-flytestdlib | 64.04% <ø> (+8.70%) |
:arrow_up: |
Flags with carried forward coverage won't be shown. Click here to find out more.
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
:rocket: New features to boost your workflow:
- :snowflake: Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
@lowc1012 , is this ready for review?
Code Review Agent Run #55c786
Actionable Suggestions - 3
-
flyteadmin/pkg/server/service.go - 3
- Missing gRPC server graceful shutdown · Line 444-445
- Consider timeout for gRPC graceful shutdown · Line 448-450
- Potential race condition in server startup · Line 558-563
Filtered by Review Rules
Bito filtered these suggestions based on rules created automatically for your feedback. Manage rules.
-
flyteadmin/pkg/server/service.go - 2
- Error not propagated from shutdown · Line 574-575
- Improved error handling for HTTP server · Line 430-432
Review Details
-
Files reviewed - 1 · Commit Range:
490e289..a264c15- flyteadmin/pkg/server/service.go
-
Files skipped - 0
-
Tools
- Whispers (Secret Scanner) - ✔︎ Successful
- Detect-secrets (Secret Scanner) - ✔︎ Successful
Changelist by Bito
This pull request implements the following key changes.
| Key Change | Files Impacted |
|---|---|
| Bug Fix - Bug Fix: Update configuration for graceful shutdown |
- - - |
| Bug Fix - Bug Fix: Implement graceful shutdown in server operations |
- |
Code Review Agent Run #e9c2cd
Actionable Suggestions - 0
Review Details
-
Files reviewed - 1 · Commit Range:
a264c15..d1714f6- flyteadmin/pkg/server/service.go
-
Files skipped - 0
-
Tools
- Whispers (Secret Scanner) - ✔︎ Successful
- Detect-secrets (Secret Scanner) - ✔︎ Successful
Bito Usage Guide
Commands
Type the following command in the pull request comment and save the comment.
/review- Manually triggers a full AI review.
Refer to the documentation for additional commands.
Configuration
This repository uses code_review_bito You can customize the agent settings here or contact your Bito workspace admin at [email protected].
Documentation & Help
@eapolinario please help review it, thank you
@eapolinario please help review it, thank you
I was the original one that filed this issue for myself to take a look at later so I can review it a bit.
Part of my thinking when handling graceful shutdown was not just the servers but also draining any buffers, specifically for cloud events so that we minimize notifications that are dropped because they are still buffered in memory.
Part of my thinking when handling graceful shutdown was not just the servers but also draining any buffers, specifically for cloud events so that we minimize notifications that are dropped because they are still buffered in memory.
I think the buffers might have been in my company's custom fork but it would still be good to look for anything else that can be drained or shutdown via context after the servers are shutdown.
For example, there are a number of things that get kicked off in goroutines here liked the scheduled execution executor: https://github.com/lowc1012/flyte/blob/master/flyteadmin/pkg/rpc/adminservice/base.go
Code Review Agent Run #138eb1
Actionable Suggestions - 0
Review Details
-
Files reviewed - 1 · Commit Range:
d1714f6..cdb0383- flyteadmin/pkg/server/service.go
-
Files skipped - 0
-
Tools
- Whispers (Secret Scanner) - ✔︎ Successful
- Detect-secrets (Secret Scanner) - ✔︎ Successful
Bito Usage Guide
Commands
Type the following command in the pull request comment and save the comment.
/review- Manually triggers a full AI review.
Refer to the documentation for additional commands.
Configuration
This repository uses code_review_bito You can customize the agent settings here or contact your Bito workspace admin at [email protected].
Documentation & Help
Code Review Agent Run #4c1dda
Actionable Suggestions - 0
Filtered by Review Rules
Bito filtered these suggestions based on rules created automatically for your feedback. Manage rules.
-
flyteadmin/pkg/server/service.go - 2
- Consider validating graceful shutdown timeout value · Line 578-578
- Consider validating the shutdown timeout value · Line 444-444
Review Details
-
Files reviewed - 4 · Commit Range:
cdb0383..6d76383- flyteadmin/pkg/config/config.go
- flyteadmin/pkg/config/serverconfig_flags.go
- flyteadmin/pkg/config/serverconfig_flags_test.go
- flyteadmin/pkg/server/service.go
-
Files skipped - 0
-
Tools
- Whispers (Secret Scanner) - ✔︎ Successful
- Detect-secrets (Secret Scanner) - ✔︎ Successful
Bito Usage Guide
Commands
Type the following command in the pull request comment and save the comment.
/review- Manually triggers a full AI review.
Refer to the documentation for additional commands.
Configuration
This repository uses code_review_bito You can customize the agent settings here or contact your Bito workspace admin at [email protected].
Documentation & Help
@lowc1012 can you sign off your commits?
@lowc1012 can you sign off your commits?
done