opal icon indicating copy to clipboard operation
opal copied to clipboard

Error resolving broadcast hostname being swallowed

Open keyz182 opened this issue 1 year ago • 10 comments

Describe the bug

We had a deployment of opal-server that we'd typoed the broadcast URI. We were seeing websockets disconnect errors in the clients, but no errors server side, so no clues it was the broadcast URI.

After turning debug logging on for the server, I spotted the following output:

2024-12-04T15:27:01.018085+0000| fastapi_websocket_rpc.websocket_rpc_e...| INFO  | Client connected
2024-12-04T15:27:01.018500+0000| websockets.legacy.server                | INFO  | connection open
2024-12-04T15:27:01.022749+0000| fastapi_websocket_rpc.rpc_channel       |DEBUG  | Handling RPC request - {'request': RpcRequest(method='_ping_', arguments={}, call_id='5d8d421e3dc94751a035b202644c8e8a'), 'channel': '72c2a547ffc64741bd095079bc778d7d'}
2024-12-04T15:27:01.023420+0000| fastapi_websocket_pubsub.pub_sub_server |DEBUG  | task is done: {<Task finished name='Task-3053' coro=<EventBroadcaster.__read_notifications__() done, defined at /usr/local/lib/python3.10/site-packages/fastapi_websocket_pubsub/event_broadcaster.py:245> exception=gaierror(-2, 'Name or service not known')>}
2024-12-04T15:27:01.023565+0000| fastapi_websocket_pubsub.event_broadc...| INFO  | Cancelling broadcast listen task
2024-12-04T15:27:01.023677+0000| fastapi_websocket_pubsub.event_broadc...|DEBUG  | Unsubscribing from ALL TOPICS
2024-12-04T15:27:01.023790+0000| fastapi_websocket_pubsub.event_notifier |DEBUG  | Removing Subscription of topic='__EventNotifier_ALL_TOPICS__' for subscriber=cc43be7c3ebb41e1b4869ace10d213db
2024-12-04T15:27:01.023948+0000| fastapi_websocket_rpc.websocket_rpc_e...| INFO  | Client connection failed - 42723 :: 72c2a547ffc64741bd095079bc778d7d
2024-12-04T15:27:01.024165+0000| fastapi_websocket_pubsub.pub_sub_server |DEBUG  | Leaving endpoint's main loop
2024-12-04T15:27:01.026785+0000| websockets.legacy.server                | INFO  | connection closed

Based on the 4th line, I went to look at fastapi_websocket_pubsub/event_broadcaster.py:245 and saw it was referencing the broadcast URI, at which point I double checked ours and saw the typo.

What I believe to be a bug is that the error is being swallowed, and should probably be elevated to an ERROR level log for visibility.

To Reproduce Set the broadcast URI to an invalid value - in our case, a postgres URI with a hostname that didn't resolve.

Logs: se616-opal-opal-server-545c454db-hx2nc.log

Expected behavior A clear error level log to indicate that the broadcast URI could not be resolved

OPAL version

  • Version: Client - 0.7.15, Server - 0.7.8

keyz182 avatar Dec 04 '24 15:12 keyz182

Thanks for reporting @keyz182 - this seems inline with https://github.com/permitio/opal/issues/711 Would you be open to contributing a fix with a PR?

orweis avatar Dec 04 '24 18:12 orweis

/bounty 50

gemanor avatar Dec 04 '24 19:12 gemanor

💎 $50 bounty • Permit.io

Steps to solve:

  1. Start working: Comment /attempt #716 with your implementation plan
  2. Submit work: Create a pull request including /claim #716 in the PR body to claim the bounty
  3. Receive payment: 100% of the bounty is received 2-5 days post-reward. Make sure you are eligible for payouts

Thank you for contributing to permitio/opal!

Add a bountyShare on socials

Attempt Started (GMT+0) Solution
🟢 @daveads Dec 4, 2024, 7:45:18 PM WIP
🟢 @vikashsprem Dec 4, 2024, 7:45:21 PM WIP
🟢 @TheVixhal Dec 30, 2024, 10:12:28 AM #728

algora-pbc[bot] avatar Dec 04 '24 19:12 algora-pbc[bot]

/attempt #716

Algora profile Completed bounties Tech Active attempts Options
@daveads    2 Permitio bounties
+ 7 bounties from 4 projects
Python, Shell,
Go & more
﹟12
Cancel attempt

daveads avatar Dec 04 '24 19:12 daveads

Thanks for reporting @keyz182 - this seems inline with #711 Would you be open to contributing a fix with a PR?

Ah, #711 is my co-worker :)

Looks like someone's already attempting it, so I'll give them a chance.

keyz182 avatar Dec 04 '24 21:12 keyz182

/attempt #716

Options

TheVixhal avatar Dec 30 '24 10:12 TheVixhal

💡 @TheVixhal submitted a pull request that claims the bounty. You can visit your bounty board to reward.

algora-pbc[bot] avatar Dec 30 '24 10:12 algora-pbc[bot]

Hey, @TheVixhal, we already assigned @daveads to this ticket

gemanor avatar Dec 30 '24 11:12 gemanor

Hey, @TheVixhal, we already assigned @daveads to this ticket

Ooh, sorry😅, I didn’t check earlier. BTW, I learned a lot while fixing this bug. At first, I tried many things, but in the end, I realized the repo was using an older version of fastapi_websocket_pubsub. I felt like I should’ve checked that first. Let me know if there's anything else I can assist with.

TheVixhal avatar Dec 30 '24 11:12 TheVixhal

can I try to solve this issue OR this issue is already assigned someone

Harsh9485 avatar Apr 04 '25 08:04 Harsh9485