zebra: FRR restart leads to zebra mlag core
Issue: At higher mroute scale (around 900) in a PIM MLAG active-active setup, a crash was observed when a) restarting the frr service on the standby, or b) enabling/disabling PIM active-active.
Root Cause: During a bulk delete event, the message read from the socket carried around 500 mroutes. The stream allocated for decoding the protobuf message is 32768 bytes, but the decoded message for 500 mroutes needs 34000 bytes, so space runs out while adding the 482nd mroute to the stream. The loop already checks the size before writing each mroute, but that check compared against the total allocated size of the stream rather than the space remaining in it. The fix checks the remaining space via the STREAM_WRITEABLE API instead of the allocated stream size.
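The difference between the two checks can be illustrated with a minimal sketch. It does not use the actual zebra MLAG code: `demo_stream`, `fits_old`, `fits_new`, and `demo_pack` are hypothetical names, the two macros are simplified stand-ins modeled on `STREAM_SIZE` and `STREAM_WRITEABLE` from FRR's lib/stream.h, and the exact form of the old check is an assumption based on the description above. The 68-byte entry size used in the usage note is likewise an assumption, derived from 34000 bytes / 500 mroutes.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for FRR's struct stream: only the fields needed here. */
struct demo_stream {
	size_t endp; /* bytes already written */
	size_t size; /* total bytes allocated */
};

/* Stand-ins modeled on STREAM_SIZE / STREAM_WRITEABLE in lib/stream.h. */
#define DEMO_STREAM_SIZE(S) ((S)->size)
#define DEMO_STREAM_WRITEABLE(S) ((S)->size - (S)->endp)

/* Assumed old check: compares against the whole allocation, so it keeps
 * passing no matter how full the stream already is. */
static int fits_old(const struct demo_stream *s, size_t entry_len)
{
	return entry_len <= DEMO_STREAM_SIZE(s);
}

/* Fixed check: compares against the space that is still free. */
static int fits_new(const struct demo_stream *s, size_t entry_len)
{
	return entry_len <= DEMO_STREAM_WRITEABLE(s);
}

/* Simulate packing `count` entries of `entry_len` bytes, guarding each write
 * with `fits`. Returns the number of entries written and sets *overflowed
 * when the guard let through a write that would not fit. */
static size_t demo_pack(size_t stream_size, size_t entry_len, size_t count,
			int (*fits)(const struct demo_stream *, size_t),
			int *overflowed)
{
	struct demo_stream s = { .endp = 0, .size = stream_size };
	size_t written = 0;

	assert(overflowed != NULL);
	*overflowed = 0;
	for (size_t i = 0; i < count; i++) {
		if (!fits(&s, entry_len))
			break; /* guard stopped the loop cleanly */
		if (s.endp + entry_len > s.size) {
			*overflowed = 1; /* guard passed but there is no room */
			break;
		}
		s.endp += entry_len;
		written++;
	}
	return written;
}
```

Calling `demo_pack(32768, 68, 500, fits_old, &ov)` writes 481 entries and then flags an overflow on the 482nd, matching the failure point described above; with `fits_new` the loop stops after the same 481 entries without attempting the write that does not fit.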
Testing: Tested at 900-mroute scale on the PIM MLAG active-active setup with FRR restart. No crash was observed.
Ticket: #4633514
ci:rerun
@Mergifyio backport stable/10.5 stable/10.4 stable/10.3 stable/10.2 stable/10.1 stable/10.0