Error resolving broadcast hostname being swallowed
Describe the bug
We had a deployment of opal-server that we'd typoed the broadcast URI. We were seeing websockets disconnect errors in the clients, but no errors server side, so no clues it was the broadcast URI.
After turning debug logging on for the server, I spotted the following output:
2024-12-04T15:27:01.018085+0000| fastapi_websocket_rpc.websocket_rpc_e...| INFO | Client connected
2024-12-04T15:27:01.018500+0000| websockets.legacy.server | INFO | connection open
2024-12-04T15:27:01.022749+0000| fastapi_websocket_rpc.rpc_channel |DEBUG | Handling RPC request - {'request': RpcRequest(method='_ping_', arguments={}, call_id='5d8d421e3dc94751a035b202644c8e8a'), 'channel': '72c2a547ffc64741bd095079bc778d7d'}
2024-12-04T15:27:01.023420+0000| fastapi_websocket_pubsub.pub_sub_server |DEBUG | task is done: {<Task finished name='Task-3053' coro=<EventBroadcaster.__read_notifications__() done, defined at /usr/local/lib/python3.10/site-packages/fastapi_websocket_pubsub/event_broadcaster.py:245> exception=gaierror(-2, 'Name or service not known')>}
2024-12-04T15:27:01.023565+0000| fastapi_websocket_pubsub.event_broadc...| INFO | Cancelling broadcast listen task
2024-12-04T15:27:01.023677+0000| fastapi_websocket_pubsub.event_broadc...|DEBUG | Unsubscribing from ALL TOPICS
2024-12-04T15:27:01.023790+0000| fastapi_websocket_pubsub.event_notifier |DEBUG | Removing Subscription of topic='__EventNotifier_ALL_TOPICS__' for subscriber=cc43be7c3ebb41e1b4869ace10d213db
2024-12-04T15:27:01.023948+0000| fastapi_websocket_rpc.websocket_rpc_e...| INFO | Client connection failed - 42723 :: 72c2a547ffc64741bd095079bc778d7d
2024-12-04T15:27:01.024165+0000| fastapi_websocket_pubsub.pub_sub_server |DEBUG | Leaving endpoint's main loop
2024-12-04T15:27:01.026785+0000| websockets.legacy.server | INFO | connection closed
Based on the 4th line, I went to look at fastapi_websocket_pubsub/event_broadcaster.py:245 and saw it was referencing the broadcast URI, at which point I double checked ours and saw the typo.
What I believe to be a bug is that the error is being swallowed, and should probably be elevated to an ERROR level log for visibility.
To Reproduce Set the broadcast URI to an invalid value - in our case, a postgres URI with a hostname that didn't resolve.
Logs: se616-opal-opal-server-545c454db-hx2nc.log
Expected behavior A clear error level log to indicate that the broadcast URI could not be resolved
OPAL version
- Version: Client - 0.7.15, Server - 0.7.8
Thanks for reporting @keyz182 - this seems inline with https://github.com/permitio/opal/issues/711 Would you be open to contributing a fix with a PR?
/bounty 50
💎 $50 bounty • Permit.io
Steps to solve:
- Start working: Comment
/attempt #716with your implementation plan - Submit work: Create a pull request including
/claim #716in the PR body to claim the bounty - Receive payment: 100% of the bounty is received 2-5 days post-reward. Make sure you are eligible for payouts
Thank you for contributing to permitio/opal!
Add a bounty • Share on socials
| Attempt | Started (GMT+0) | Solution |
|---|---|---|
| 🟢 @daveads | Dec 4, 2024, 7:45:18 PM | WIP |
| 🟢 @vikashsprem | Dec 4, 2024, 7:45:21 PM | WIP |
| 🟢 @TheVixhal | Dec 30, 2024, 10:12:28 AM | #728 |
/attempt #716
| Algora profile | Completed bounties | Tech | Active attempts | Options |
|---|---|---|---|---|
| @daveads | 2 Permitio bounties + 7 bounties from 4 projects |
Python, Shell, Go & more |
﹟12 |
Cancel attempt |
Thanks for reporting @keyz182 - this seems inline with #711 Would you be open to contributing a fix with a PR?
Ah, #711 is my co-worker :)
Looks like someone's already attempting it, so I'll give them a chance.
💡 @TheVixhal submitted a pull request that claims the bounty. You can visit your bounty board to reward.
Hey, @TheVixhal, we already assigned @daveads to this ticket
Hey, @TheVixhal, we already assigned @daveads to this ticket
Ooh, sorry😅, I didn’t check earlier. BTW, I learned a lot while fixing this bug. At first, I tried many things, but in the end, I realized the repo was using an older version of fastapi_websocket_pubsub. I felt like I should’ve checked that first. Let me know if there's anything else I can assist with.
can I try to solve this issue OR this issue is already assigned someone