RawRabbit icon indicating copy to clipboard operation
RawRabbit copied to clipboard

RawRabbit has solvable memory leaks (ex. PublishAcknowledgeMiddleware)

Open tristan-hyams opened this issue 7 years ago • 4 comments

Really any collection reference to IModel is a potential memory leak.

When using transient channels (such as IModel) in Dictionaries or ConcurrentDictionaries as this library does you need to implement a background process that locks or slims the dictionaries to remove keys of dead IModels.

For example:

public static Dictionary<IModel, int> models3 = new Dictionary<IModel, int>();
public static async Task CleanupDictionariesWithSemaphoreAsync(TimeSpan timeSpan)
{
    while (true)
    {
        await Task.Delay(timeSpan);
        await slimShady3.WaitAsync();

        var count = 0;
        var listOfItemsToRemove = models3.Where(x => x.Key.IsClosed).ToArray();
        foreach (var kvp in listOfItemsToRemove)
        {
            models3.Remove(kvp.Key);
            count++;
        }
        await Console.Out.WriteLineAsync($"Dead channels removed: {count}");

        slimShady3.Release();
    }
}

These models are forever inside the Dictionary (used as keys) and are never allowed to be reclaimed by GC even if the IModel is later disposed. This isn't a hot approach as we have discovered several memory leaks regarding this pipeline. We have since implemented a ReplacementMiddleware that has something akin to the above running to clean up ChannelLocks etc.

I would also toss in utilizing a compiled in 4.7.2 and C# 7.2 version or RabbitMQ client 5.x. It performs a heck of lot better than 4.5.1.

I have created a repo demonstration of this memory leak and how I implemented the above code to prevent threads colliding with the dictionaries.

https://github.com/thyams/CookedRabbit

tristan-hyams avatar Jun 09 '18 17:06 tristan-hyams

Hi @thyams - thanks a lot for you deep analyze! I would be more than happy to review a PR (or perhaps two PRs in this case) with proposed fixes for this.

pardahlman avatar Jun 11 '18 05:06 pardahlman

No problem, when I get some time, I will post replacement channel middleware and you can critique it with me.

tristan-hyams avatar Jun 15 '18 13:06 tristan-hyams

@thyams could this be related to not handling events from the Default/Basic event consumers such as Consumer Cancel Notification ?

In that issue I describe where the server has send consumer cancel but RawRabbit is not attaching/handling it and thus the IModel will stay forever. In the case of being a subscriber, it will never get removed either it appears.

RedOnePrime avatar Aug 07 '18 19:08 RedOnePrime

There was a plethora of issues, even my original solution faulted over time.

Basically what it comes down to, no matter how clever you get with TaskCompletionSource, Channel.IsOpen, PingPong tests, there is no sure fire way of ensuring that your succesfully awaited things like AutoRecovery if connection is IRecoverable. There are too many moving components.

Furthermore, if a Task timesout or is cancelled in the background, you miss it unless you have a global exception handler for UnobserevedTaskExceptions.

Suffice to say, the rabbitmq dotnetclient isn't my favorite library. RawRabbit does a pretty decent job of mitigating most issues, but as seen in my environment Transient network outages often lead to catastrophic service issues.

The real solution is two parts: Search: First Identify Models that are closed, tag them with a timestamp. Destroy: After timespan, assume channel is not coming back alive and remove it from dictionary sources.

That solves the memory leaks. My solution interferes with recovery, so the first step is to search for all IsOpen false Channels and flag them for removal, a second task removes every channel from the dictionary after x timeframe.

I intended to write up a PR to create such a Search & Destroy system but I have been busy with my own RabbitMQ library (https://github.com/thyams/CookedRabbit) and work.

tristan-hyams avatar Aug 07 '18 20:08 tristan-hyams