Memory leak
When a killed node rejoins the cluster, a memory leak occurs. Here is the sanitizer report:
1: raft_start(): io: load closed segment 0000000000000138-0000000000000201: entries batch 5 starting at byte 168: entries count in preamble is zero
==ERROR==: LeakSanitizer: detected memory leaks
Direct leak of 160 byte(s) in 1 object(s) allocated from:
#0 0x7f3777ab3c3e in __interceptor_realloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cc:163
#1 0x55f4037670b2 in extendEntries src/uv_segment.c:344
#2 0x25a361d1 (<unknown module>)
SUMMARY: AddressSanitizer: 160 byte(s) leaked in 1 allocation(s).
Could you take a look?
@MathieuBordere is off this week; he should be able to look into this next week.
@stgraber I'm actually also off next week.
Oh right, oops :)
I've been looking into this. Obviously we hit the error branch here:
https://github.com/canonical/raft/blob/17ce02fa378f650fc6a02d21355c37e684da1167/src/uv_segment.c#L247-L251
That's in uvLoadEntriesBatch. Both uvSegmentLoadOpen and uvSegmentLoadClosed call it in a loop, calling extendEntries as needed to reallocate the accumulated entries array. When an error is detected in the middle of that loop, we goto err without freeing the last allocation made by extendEntries. Should be straightforward to fix; I'll submit a PR shortly.
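For illustration, here is a minimal, self-contained sketch of the pattern, not the actual raft code: load_batch and load_segment are hypothetical stand-ins for uvLoadEntriesBatch and the segment-loading loop, and the realloc plays the role of extendEntries. The free on the error path is the kind of one-line fix described above; the original bug was jumping to err without it.

```c
#include <stdio.h>
#include <stdlib.h>

struct entry { size_t len; };

/* Toy stand-in for uvLoadEntriesBatch: succeeds twice, then fails,
 * simulating a decode error such as "entries count in preamble is
 * zero" partway through a segment. */
static int load_batch(unsigned call, struct entry **batch, unsigned *m)
{
    if (call >= 2) {
        return -1; /* malformed batch */
    }
    *m = 4;
    *batch = calloc(*m, sizeof **batch);
    return *batch == NULL ? -1 : 0;
}

/* Simplified segment loader. The realloc below plays the role of
 * extendEntries: it grows the accumulated array on each batch. The
 * leak arose when a mid-loop failure jumped to err without releasing
 * that array. */
static int load_segment(struct entry **entries, unsigned *n)
{
    struct entry *all = NULL;
    unsigned count = 0;
    unsigned call = 0;
    int rv = 0;

    for (;; call++) {
        struct entry *batch;
        unsigned m;

        rv = load_batch(call, &batch, &m);
        if (rv != 0) {
            goto err; /* mid-loop failure, as in the sanitizer trace */
        }
        if (m == 0) {
            break; /* no more batches */
        }

        /* Grow the accumulated array (the extendEntries step). */
        struct entry *extended = realloc(all, (count + m) * sizeof *all);
        if (extended == NULL) {
            free(batch);
            rv = -1;
            goto err;
        }
        all = extended;
        for (unsigned i = 0; i < m; i++) {
            all[count + i] = batch[i];
        }
        count += m;
        free(batch);
    }

    *entries = all;
    *n = count;
    return 0;

err:
    free(all); /* the fix: without this, the partial array leaks */
    return rv;
}

int main(void)
{
    struct entry *entries;
    unsigned n;
    if (load_segment(&entries, &n) != 0) {
        fprintf(stderr, "load failed; error path freed the partial array\n");
        return 1;
    }
    free(entries);
    return 0;
}
```

If you remove the free(all) on the error path and build this with -fsanitize=address, you get a report of the same shape as the one above, with the direct leak attributed to the realloc call.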
The fact that the error occurs in the first place may be a separate bug, unless the data on disk somehow got corrupted.