
Roadmap: Connection Bonding

Open maxsharabayko opened this issue 6 years ago • 9 comments

Listener Callback: Group Extension (API)

The existing listener callback does not provide fields to pass group-related data, so another function is needed.

  • [x] Add SRTO_GROUPTYPE socket option to retrieve group type inside the listener callback. PR #1294
  • ✖️ New listener callback - postponed (for v1.5.1?).

Rejection Reason Enhancements (API)

Make it possible for an application to set a specific reason for rejecting a connection from a listener callback.

  • [x] PR #1194 - added rejection reason: timeout. Documentation changes are requested.

  • [x] PR #1291 - Enhanced and customizable reject reason - Requires an update to latest master

  • [x] PR #1300 Add definitions of predefined HTTP status codes used as rejection reason codes. (Tasks: @maxsharabayko to review)
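
A minimal sketch of how the rejection code space could be modeled, assuming the predefined codes are an HTTP status code added to a base of 1000 and user-defined codes start at 2000 (to my understanding this matches the SRT_REJC_PREDEFINED and SRT_REJC_USERDEFINED values in the SRT headers, but verify against your version). The Python names below are hypothetical and are not part of the SRT API.

```python
# Assumed range boundaries; check them against the SRT headers you build with.
REJC_INTERNAL = 0        # library-internal rejection reasons
REJC_PREDEFINED = 1000   # predefined codes: base + HTTP-like status
REJC_USERDEFINED = 2000  # application-specific codes


def predefined_reject_code(http_status):
    """Map an HTTP-like status (e.g. 401) into the predefined range."""
    if not 0 <= http_status < 1000:
        raise ValueError("status must fit in the predefined range")
    return REJC_PREDEFINED + http_status


def classify_reject_code(code):
    """Tell which range a rejection code belongs to."""
    if code >= REJC_USERDEFINED:
        return "user-defined"
    if code >= REJC_PREDEFINED:
        return "predefined"
    return "internal"
```

For example, an application that wants to signal "unauthorized" from its listener callback would, under these assumptions, report `predefined_reject_code(401)`.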

Group Statistics

Reuse the srt_bistats function for SRT socket groups.

  • [x] srt_bistats function for groups: Added group support for stats probing. PR #1303.

  • [ ] srt_bistats function for groups: reuse pktSentTotal, pktSndLossTotal etc. as the sum of the corresponding values of the individual sockets in a group.

  • [ ] Decide on pktRcvDiscardTotal for groups
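
The summation proposed above can be sketched as follows. This is only an illustration of the idea, assuming each member socket's statistics are available as a plain dictionary; `aggregate_group_stats` is a hypothetical helper, not an SRT function, and pktRcvDiscardTotal is deliberately left out because its group semantics are still undecided.

```python
def aggregate_group_stats(member_stats,
                          fields=("pktSentTotal", "pktSndLossTotal")):
    """Sum the selected counters over all member sockets of a group.

    member_stats: iterable of dicts, one per member socket (assumed shape).
    A counter missing from a member is treated as 0.
    """
    members = list(member_stats)
    return {
        field: sum(member.get(field, 0) for member in members)
        for field in fields
    }
```

Note that in a broadcast group every payload packet goes out on every link, so a plain sum counts each payload packet once per link.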

Group Status

  • [x] PR #1222 - srt_group_data(..) API function to return group size (related to the bonding API).

  • [x] PR #1326 - Add link state field to SRT_GROUP_DATA, and retrieve the value via srt_group_data(..). Possible link states: active/unstable/idle/broken.

  • [ ] TODO

Uncategorized

  • [x] Issue #1245 Group Membership Handshake extension - PR #1285

  • [ ] Set latency per socket in a group (and probably some other options like MTU size).

    • requires: design of the idea (Tasks: @maxsharabayko to provide description)
  • [x] PR #1283 [core] Added unique send/receive stats - merged

  • [x] PR #1263 [docs] Added unique send/receive stats - (discussing undecrypted packets)

  • [ ] PR #1257 Balancing group implementation - extract only API-related changes (?) (extraction already done, PR describes extra changes for balancing groups)

  • [ ] SRT Handshake extension for socket groups - add docs.

  • [x] PR #1314 - Added the ability to set per-connection socket options, mainly required for supporting the SRTO_BINDTODEVICE option

Balancing

  • [ ] 12. Load balancing implementation

  • [ ] 13. Load balancing documentation

Completed work
  • [x] 1. INTERNALS: #1109
    • The CUDTGroup class
    • Associated symbols the class depends on
    • Support functions in CUDT class for creation/deletion and hookup
    • (transmission functions temporarily deleted)
  • [x] 2. SRT API: #1115
    • Handling group ID in existing API functions
    • Extra API functions dedicated for groups
  • [X] 2.1. SRT API docs PR #1123

  • [x] 2.3. Added short doc about bonding PR #1117

  • [x] 3. HANDSHAKE AND MANAGEMENT PR #1119
    • Passing and recognizing the group in the handshake
    • Support functions in other classes to handle group synchronization
    • Doing specific actions as required for the group in existing internals
  • [x] 4. TRANSMISSION PR #1124
    • The receiving function, universal for broadcast and backup type
    • The sending function for broadcast type only.
  • [x] 4.1. Changed Socket Group API PR #1137

  • [X] 5. APPLICATIONS PR #1139
    • Handling the special syntax for groupwise connections
    • Handling the groups internally in the application
  • [x] 6. Added test examples for bonding PR #1143.

  • [x] 7. Redesigned SRTO_GROUPCONNECT option PR #1150

  • [x] 8. Added accept_bond function for blocking-mode multi-listener accept-waiting PR #1153

  • [x] 9. Refactoring before Backup groups (PR #1167)

  • [x] 10. Main backup implementation #1178

  • [x] 11. Main backup documentation #1168

maxsharabayko avatar Feb 06 '20 08:02 maxsharabayko

Hi, I have one question.

How will SRT_GTYPE_BALANCING work? Will it just round-robin, or will it have some kind of congestion control or packet-loss count, so that each link only gets the amount of packets it can handle?

EDIT: Really great work. Keep it up.

Llorx avatar Feb 28 '20 09:02 Llorx

Hi @Llorx, thanks for your interest in this feature. The first implementation of the balancing mode will likely be simple, maybe relying on some heuristics. The goal is to have an extensible module that allows people to add custom algorithms, similar to how the Congestion Control and Packet Filter modules are implemented.

maxsharabayko avatar Mar 02 '20 09:03 maxsharabayko

You can take a look at my development branch dev-groups-external-msgsync.

Since the method of selecting the link can be based on various parameters, I have provided a framework for adding various possible balancing algorithms. Currently two are provided:

  • linkSelect_plain: simple round-robin with no parameters tracked
  • linkSelect_window: the selector tracks the cost of sending a packet over a particular link by checking the flight window on all links and deciding the link load based on it. Every sending operation is considered as adding load to that link, and for the next sending the currently least loaded link is selected. This algorithm is provided, but it hasn't undergone any stress testing yet and will likely need to be largely improved.
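
The two selection strategies described above can be sketched roughly as follows. This is a simplified Python model under stated assumptions, not the code from the branch: the `Link` class, its `in_flight` counter, and all function names are hypothetical, and the "load" is reduced to the number of unacknowledged packets per link.

```python
from dataclasses import dataclass
from itertools import cycle


@dataclass
class Link:
    """Hypothetical model of one member link; not an SRT structure."""
    name: str
    in_flight: int = 0  # packets sent but not yet acknowledged


def make_plain_selector(links):
    """Round-robin: rotate through the links, tracking no parameters."""
    rotation = cycle(links)
    return lambda: next(rotation)


def select_least_loaded(links):
    """Window-style: pick the link with the fewest packets in flight.

    On a tie, the link that appears first in the list wins.
    """
    return min(links, key=lambda link: link.in_flight)


def send_packet(links):
    """Send on the least loaded link; each send adds load to that link."""
    link = select_least_loaded(links)
    link.in_flight += 1
    return link


def acknowledge(link, count=1):
    """An acknowledgement removes load from the link it refers to."""
    link.in_flight = max(0, link.in_flight - count)
```

With the window-style selector, a link that acknowledges slowly keeps a larger flight window and therefore receives fewer new packets, which is the behaviour round-robin cannot provide.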

ethouris avatar Mar 02 '20 10:03 ethouris

Nice @ethouris. linkSelect_window seems pretty cool.

Hm @maxsharabayko, I see a problem with the modular thingy. Most users will benefit from SRT by using third-party software that already has SRT implemented, so it is impossible for them to create their own module, as that requires compiling your own SRT with the module "inserted". One example is vMix.

Would it be difficult to release it with some cool modules already built in, so that third-party software can benefit from them? I don't expect them to build cool modules on their own.

Or maybe add some kind of "dll" thingy too. I don't know.

Also, there are some MP-TCP congestion control algorithms that you could maybe benefit from.

By the way, I know that there is a long way to go until we have something cool hehe. Nice work you have there.

Llorx avatar Mar 05 '20 12:03 Llorx

Hi, sorry for bothering. Just to know, are you already working on something regarding 12. Load balancing implementation? You have way more experience and I'd love to see what you come up with before I start fiddling with it.

EDIT: Oh, I forgot about @ethouris dev-groups-external-msgsync branch! Going to play with it a bit :-P

Llorx avatar Sep 16 '20 16:09 Llorx

Just note that we decided to put it aside because it cannot support redundancy protection. Simply, if you break one link, you'll get packet drops.

ethouris avatar Sep 16 '20 18:09 ethouris

Oh. Just fiddled a bit @ethouris. Copied your code to the latest master and I'm trying to make it work.

When a link goes down, is it not possible to put its non-acked packets in the queue again (or resend them through the group or whatever)? I have my left arm in a cast, so it is really hard for me to work fluidly these days; I will try my best.

Llorx avatar Sep 16 '20 19:09 Llorx

Excellent feature!
Compared with Broadcast and Main/Backup, I believe the balancing mode will be even more useful, as long as link selection (controlling the data sent on each link) and ACKs (which may travel across another channel) are carefully handled. FEC and interleaving will be helpful, too. It depends on how these algorithms are integrated in the implementation.

Oh. Just fiddled a bit @ethouris. Copied your code to the latest master and I'm trying to make it work.

When a link goes down, is not possible to put its non-acked packets in the queue again (or resend them through the group or whatever)? I have the left arm in a cast so is really hard for me these days to work fluidly, will try my best.

It's definitely possible and useful! I have done similar things before that worked well, but the details need to be handled carefully so that packets are resent in time but not too early.
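
A minimal sketch of the re-queueing idea discussed here, assuming the group keeps its own record of unacknowledged packets per link. All names are hypothetical and nothing below reflects the actual SRT internals; the timing question (resend in time but not too early) is not modeled at all.

```python
from collections import deque


class GroupSender:
    """Hypothetical group-level bookkeeping of sent and pending packets."""

    def __init__(self, link_names):
        self.pending = deque()  # (seq, payload) waiting to be sent
        self.unacked = {name: {} for name in link_names}  # link -> {seq: payload}

    def record_sent(self, link, seq, payload):
        """Remember a packet until the link acknowledges it."""
        self.unacked[link][seq] = payload

    def record_ack(self, link, seq):
        """Forget a packet once it has been acknowledged."""
        self.unacked[link].pop(seq, None)

    def link_broken(self, link):
        """Put the broken link's unacknowledged packets back at the front
        of the pending queue, lowest sequence number first."""
        lost = self.unacked.pop(link, {})
        for seq in sorted(lost, reverse=True):
            self.pending.appendleft((seq, lost[seq]))
        return len(lost)
```

Placing the recovered packets ahead of newer data is one possible design choice; it favours in-order delivery at the receiver over fairness to packets that were already waiting.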

I can't find dev-groups-external-msgsync anymore. How is progress on this feature (balancing mode)?

wangyoucao577 avatar Jun 14 '22 11:06 wangyoucao577

This is the branch on my own clone of the repository, not in the main repository.

I don't know how "mergeable" it is now against the current code; I haven't updated it at all since the decision was made to put this feature aside.

This implementation had the disadvantage that "zipping" was only possible based on message numbers, so packets lost due to a broken link are lost forever. This part will definitely have to be rewritten in any new implementation.

ethouris avatar Jun 14 '22 12:06 ethouris