
[BUG?] Memory accumulation causes OOM

Open stxrlong opened this issue 3 months ago • 2 comments

My program always hits an OOM problem during a specific period of time. After some simple research, I found that write IO was very heavy during this period (concurrency could reach 80k-100k QPS) and memory usage kept increasing; of course, the problem goes away after the async Redis object is released. I used the async feature of redis++, which is a nice lib and very easy to use. Because of the above issue, I spent some time studying the source code of redis++ and hiredis. I didn't find the root cause of the memory accumulation, but I found two points in redis++, one of which has greatly alleviated the memory build-up. I also tried some memory-leak detection tools, such as valgrind/asan, but they all prevented the high concurrency.

  1. We can release the formatted command right after calling the hiredis async function, as long as no error occurred, because hiredis has already copied the value. This does alleviate the memory build-up:
void _handle(redisAsyncContext &ctx, HiredisAsyncCallback callback) {
    if (redisAsyncFormattedCommand(&ctx,
                callback, this, _cmd.data(), _cmd.size()) != REDIS_OK) {
        throw_error(ctx.c, "failed to send command");
    }

    // Release the command buffer right after the call above to reduce peak
    // memory usage: hiredis has already copied it into its output buffer.
    FormattedCommand cmd = std::move(_cmd);
}
  2. Release the context when hiredis actually invokes the disconnect callback:
void AsyncConnection::_fail_events(std::exception_ptr err) {
    // we'd better release the context
    if (_ctx) {
        redisAsyncFree(_ctx);
        _ctx = nullptr;
    }

    _err = err;

    _state = State::BROKEN;

    // Must call _clean_up after `_err` has been set.
    _clean_up();
}

Note that `_ctx` is no longer a unique_ptr at this point (it comes from `_ctx = ctx.release();`), so the smart pointer's deleter is useless here. However, by the time we call `_fail_events`, hiredis may have already released the context via `redisAsyncDisconnect`, so freeing it again risks a double free.

  3. Finally, I think the memory accumulation may be caused by the reconnect logic, when redis++ actively discovers Redis connection issues while sending commands.

stxrlong avatar Mar 30 '24 13:03 stxrlong

Thanks for your report!

> Finally, I think the memory accumulation may be caused by the reconnect logic, when redis++ actively discovers Redis connection issues while sending commands.

Did you have a network issue, or some special configuration, so that Redis connection was broken from time to time? If the network is healthy, do you still have the OOM problem?

Sorry for the late reply. Too busy these days...

Regards

sewenew avatar Apr 08 '24 14:04 sewenew

You're so kind. Yeah, that's right. I wrote a test case that runs the command [zadd] on one thread. When the concurrency reaches 20k, memory jumps to 800M instantly. If no exception is thrown, memory returns to normal (about 14M); but if we receive exceptions, memory stays at 300M even after we get all the responses. The approximate test case is as follows:

#include "sw/redis++/async_redis.h"
#include <atomic>
#include <chrono>
#include <string>
#include <thread>

int main(int argc, char **argv) {
    int qps = 20000;
    std::string key = "sortedkey";

    sw::redis::AsyncRedis redis;
    ... // connection
    std::thread t([qps, key, &redis] {
        std::atomic_int32_t res = 0, excpt = 0;
        std::string v;
        for (auto i = 0; i < qps; ++i) {
            v += std::to_string(i);  // the member grows on each iteration
            redis.zadd(key, v, i).then(boost::launch::sync,
                                       [&res, &excpt](auto &&fut) {
                ++res;
                try {
                    fut.get();
                } catch (...) {
                    ++excpt;
                }
            });
        }

        while (true) {
            if (res == qps)
                break;
            std::this_thread::sleep_for(std::chrono::seconds(1));
        }
    });

    ... // thread join
    ... // while true
}

I'm sorry I haven't found the root cause of this problem yet; I will study it again during the May Day holiday.

stxrlong avatar Apr 13 '24 05:04 stxrlong