Wrong "sid" in subscriber candidates
How to use GitHub
- Please use the 👍 reaction to show that you are affected by the same issue.
- Please don't comment if you have no relevant information to add. It's just extra noise for everyone subscribed to this issue.
- Subscribe to receive notifications on status change and new comments.
A customer of the HPB service reported issues with some participants in a larger call that could not subscribe the streams of some publishers. After checking the logs of the signaling server, I noticed error messages about Candidate message sid (12345) does not match publisher sid (67890) coming from https://github.com/strukturag/nextcloud-spreed-signaling/blob/39ea15c9b3b7020792708cb184504d797f5cf8ef/mcu_janus_publisher.go#L229
The sid values in Talk are generated from the clients timestamp. In the log, the first timestamp matches the time when a subscriber was created in the affected session, the second timestamp matches the time when the publisher of the session was created, so it looks like the client sent the candidate for the subscriber stream with the sid of the own publisher.
User-Agent was Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:136.0) Gecko/20100101 Firefox/136.0, so it's a browser using the Talk web app.
Some things I noticed:
- The session left the room, started publishing and then joined the room. Shouldn't the publishing start after the room is joined?
- Between the publishing and the (failed) subscribing, there was a successful session resume.
Here are some relevant log messages of the signaling server:
...
Mar 10 20:01:51 sig-server nextcloud-spreed-signaling[32462]: clientsession.go:505: Session c2badVMZJVhHp3AKvgQWTJnIcRl94GdM7AZqrbSODAJ8PTBhdmo1REhwcXI2dl9MeUlYVDF6Yl9IcDFVT0ZWWC1YM094QnBzXzBwc0h3WjU4QlFjcmU4V2h8MDQyMzM2MTQ3MQ== left call 5uzk4jec
Mar 10 20:02:04 sig-server nextcloud-spreed-signaling[32462]: mcu_proxy.go:1172: Created video publisher 7627d347-41a9-470b-a812-0460ceef7487 on https://proxy1.domain.invalid/ for c2badVMZJVhHp3AKvgQWTJnIcRl94GdM7AZqrbSODAJ8PTBhdmo1REhwcXI2dl9MeUlYVDF6Yl9IcDFVT0ZWWC1YM094QnBzXzBwc0h3WjU4QlFjcmU4V2h8MDQyMzM2MTQ3MQ==
Mar 10 20:02:04 sig-server nextcloud-spreed-signaling[32462]: clientsession.go:946: Publishing video as 7627d347-41a9-470b-a812-0460ceef7487 for session c2badVMZJVhHp3AKvgQWTJnIcRl94GdM7AZqrbSODAJ8PTBhdmo1REhwcXI2dl9MeUlYVDF6Yl9IcDFVT0ZWWC1YM094QnBzXzBwc0h3WjU4QlFjcmU4V2h8MDQyMzM2MTQ3MQ==
Mar 10 20:02:04 sig-server nextcloud-spreed-signaling[32462]: room.go:789: Session c2badVMZJVhHp3AKvgQWTJnIcRl94GdM7AZqrbSODAJ8PTBhdmo1REhwcXI2dl9MeUlYVDF6Yl9IcDFVT0ZWWC1YM094QnBzXzBwc0h3WjU4QlFjcmU4V2h8MDQyMzM2MTQ3MQ== joined call 5uzk4jec
...
Mar 10 20:35:33 sig-server nextcloud-spreed-signaling[32462]: hub.go:1211: Resume session from 1.2.3.4 in DE (Mozilla/
5.0 (Windows NT 10.0; Win64; x64; rv:136.0) Gecko/20100101 Firefox/136.0) c2badVMZJVhHp3AKvgQWTJnIcRl94GdM7AZqrbSODAJ8PTB
hdmo1REhwcXI2dl9MeUlYVDF6Yl9IcDFVT0ZWWC1YM094QnBzXzBwc0h3WjU4QlFjcmU4V2h8MDQyMzM2MTQ3MQ== (private=resume-session-id)
...
Mar 10 20:35:34 sig-server nextcloud-spreed-signaling[32462]: clientsession.go:1024: Subscribing video from rtZ4nuxmNBQxoQyibWySkDwHnE0aeveVkPZOa2Ptm1x8PThnMU9RWE4xZWtKeGMwR2RWeWMzU25ZRC1NN3lyVFAyLThucFlLQ3JTR2QzN0xGUzZJci1Db2F8MzMzNTM2MTQ3MQ== as ee0de55d-85b6-45fa-9ce2-e89e8f645124 in session c2badVMZJVhHp3AKvgQWTJnIcRl94GdM7AZqrbSODAJ8PTBhdmo1REhwcXI2dl9MeUlYVDF6Yl9IcDFVT0ZWWC1YM094QnBzXzBwc0h3WjU4QlFjcmU4V2h8MDQyMzM2MTQ3MQ==
...
more subscribers to other sessions from c2badV...
...
Mar 10 20:35:35 sig-server nextcloud-spreed-signaling[32462]: hub.go:2673: Could not send MCU message &{Type:candidate Sid:1741635334707 RoomType:video Payload:map[candidate:map[candidate:candidate:0 1 UDP 2122252543 192.168.0.101 51661 typ host sdpMLineIndex:0 sdpMid:0]] Bitrate:0 AudioCodec: VideoCodec: VP9Profile: H264Profile: offerSdp:<nil> answerSdp:<nil>} for session c2badVMZJVhHp3AKvgQWTJnIcRl94GdM7AZqrbSODAJ8PTBhdmo1REhwcXI2dl9MeUlYVDF6Yl9IcDFVT0ZWWC1YM094QnBzXzBwc0h3WjU4QlFjcmU4V2h8MDQyMzM2MTQ3MQ== to c2badVMZJVhHp3AKvgQWTJnIcRl94GdM7AZqrbSODAJ8PTBhdmo1REhwcXI2dl9MeUlYVDF6Yl9IcDFVT0ZWWC1YM094QnBzXzBwc0h3WjU4QlFjcmU4V2h8MDQyMzM2MTQ3MQ==: Candidate message sid (1741635334707) does not match publisher sid (1741633324390)
...
the same error for more candidates sharing the same IP addresses but different ports, so probably coming from different peer connections
...
1741635334707 -> Mar 10 20:35:34
1741633324390 -> Mar 10 20:02:04
The check in the signaling server was introduced here: https://github.com/strukturag/nextcloud-spreed-signaling/commit/e86baba54d60f3b7a172f8bbf7e2e110c519e4a0
@danxuliu do you have an idea what might change the sid for the subscriber and set it to the value of the publisher / use the publisher sid for the subscriber candidates? The candidates that late (more than 30 minutes after publishing started) are surely for the subscriber and thus should have that sid.
In general I think there should be some error handling for cases like this. If the candidate is rejected by the signaling server, an error response is sent back but is probably not handled by the client (e.g. by recreating the peer and re-subscribing).