[relay-server] Add health-check agent recognition to avoid error logs
Health-check connections now send a properly formatted auth message with a well-known peer ID instead of immediately closing. The server recognizes this peer ID and handles the connection gracefully with a debug log instead of error logs.
Changes:
- Add relay/healthcheck/peerid package with health-check peer ID and dummy auth token
- Update health-check client to send auth message before closing
- Return peer ID from handshake even on validation failure
- Check for health-check agent in relay Accept() to avoid error logs
Describe your changes
Issue ticket number and link
Stack
Checklist
- [x] Is it a bug fix
- [ ] Is a typo/documentation fix
- [ ] Is a feature enhancement
- [ ] It is a refactor
- [ ] Created tests that fail without the change (if possible)
By submitting this pull request, you confirm that you have read and agree to the terms of the Contributor License Agreement.
Documentation
Select exactly one:
- [ ] I added/updated documentation for this change
- [x] Documentation is not needed for this change (explain why)
Docs PR URL (required if "docs added" is checked)
Paste the PR link from https://github.com/netbirdio/docs here:
https://github.com/netbirdio/docs/pull/__
Summary by CodeRabbit
-
New Features
- Implemented specialized health check functionality for relay connections with dedicated identification and authentication mechanisms.
- Enhanced WebSocket connection initialization with improved authentication message handling.
-
Bug Fixes
- Improved relay connection handshake error handling to preserve and return more accurate peer identification diagnostics during failures.
✏️ Tip: You can customize this high-level summary in your review settings.
Walkthrough
This PR introduces health check support for the relay server by adding peer identification, authentication tokens, and conditional error logging for health check connections across the WebSocket and handshake layers.
Changes
| Cohort / File(s) | Summary |
|---|---|
Health Check Infrastructure relay/healthcheck/peerid/peerid.go |
New module introducing HealthCheckPeerID (precomputed PeerID hash), DummyAuthToken (structurally valid auth token), createDummyToken helper (HMAC-SHA256 token with zeroed signature and "healthcheck" payload), and IsHealthCheck() for peer identification. |
Health Check Integration relay/healthcheck/ws.go, relay/server/handshake.go, relay/server/relay.go |
Modified WebSocket initialization to send authentication message immediately after connection; updated handshake return signatures to include peerID on errors; added health-check conditional logging in relay accept logic to distinguish health-check failures from regular errors. |
Estimated code review effort
🎯 3 (Moderate) | ⏱️ ~20 minutes
- Key areas requiring attention:
- Verify HealthCheckPeerID computation correctness and immutability
- Confirm DummyAuthToken is structurally valid but not cryptographically verified
- Ensure auth message marshaling/writing errors are handled without leaving dangling connections
- Validate that modified handshake return signatures maintain backward compatibility with error handling paths
- Check that health-check conditional in relay.Accept doesn't mask legitimate errors
Suggested reviewers
- mlsmaycon
Poem
🐰 A health check hops through the relay gate, With dummy tokens to authenticate, PeerIDs precomputed, swift and true, WebSockets greet them with messages anew, Debug logs whisper when health checks arrive!
Pre-merge checks and finishing touches
❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage. |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title accurately and concisely summarizes the main change: adding health-check agent recognition to prevent error logs in the relay server. |
| Description check | ✅ Passed | The description follows the template structure with completed sections: changes summary, bug fix checklist selection, and documentation rationale. All critical sections are filled. |
✨ Finishing touches
- [ ] 📝 Generate docstrings
🧪 Generate unit tests (beta)
- [ ] Create PR with unit tests
- [ ] Post copyable unit tests in a comment
- [ ] Commit unit tests in branch
fix/handle-healthcheck-errs
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
Comment @coderabbitai help to get the list of available commands and usage tips.
Quality Gate passed
Issues
0 New issues
0 Accepted issues
Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code