
memory leak when calling handle.torrent_file() from read_piece_alert

Open Lwxiang opened this issue 2 years ago • 28 comments

libtorrent version (or branch): 1.2.18

platform/architecture: x86_64 centos7.2

compiler and compiler version: gcc7.3.1

I add a big file to an lt session, and the file checking lasts for a while. During the file checking, if I call torrent_file() from the read_piece_alert's handle, there is a memory leak.

The following code easily reproduces the case:

#include <cstdio>      // std::scanf
#include <functional>
#include <future>
#include <iostream>
#include <memory>
#include <vector>

#include <unistd.h>    // sleep(), usleep()

// libtorrent
#include "libtorrent/alert_types.hpp"
#include "libtorrent/create_torrent.hpp"
#include "libtorrent/session.hpp"
#include "libtorrent/settings_pack.hpp"
#include "libtorrent/socket.hpp"
#include "libtorrent/torrent_flags.hpp"
#include "libtorrent/torrent_handle.hpp"
#include "libtorrent/torrent_info.hpp"
#include "libtorrent/torrent_status.hpp"

const lt::alert_category_t alertMask = lt::alert_category::status |
                                       lt::alert_category::file_progress |
                                       lt::alert_category::piece_progress |
                                       lt::alert_category::incoming_request |
                                       lt::alert_category::performance_warning |
                                       lt::alert_category::tracker |
                                       lt::alert_category::connect |
                                       lt::alert_category::peer;

void alert_handler(lt::session* s)
{
    while (true)
    {
        std::vector<lt::alert*> alerts;
        s->pop_alerts(&alerts);

        for (const auto alert : alerts)
        {
            switch (alert->type())
            {
            case lt::add_torrent_alert::alert_type:
            {
                std::cout << "torrent added" << std::endl;
                break;
            }
            case lt::read_piece_alert::alert_type:
            {
                lt::alert_cast<lt::read_piece_alert>(alert)->handle.torrent_file();
                break;
            }
            case lt::piece_finished_alert::alert_type:
            {
                lt::piece_finished_alert* a = lt::alert_cast<lt::piece_finished_alert>(alert);
                a->handle.read_piece(a->piece_index);
                break;
            }
            default:
                break;
            }
        }

        usleep(100000);
    }
}

int main()
{
    lt::settings_pack sp;
    sp.set_str(lt::settings_pack::listen_interfaces, "127.0.0.1:60020");
    sp.set_int(lt::settings_pack::alert_mask, alertMask);
    sp.set_int(lt::settings_pack::aio_threads, 16);
    sp.set_int(lt::settings_pack::alert_queue_size, 10000);

    auto s = lt::session(sp);
    sleep(1);

    auto f = std::async(std::launch::async, std::bind(&alert_handler, &s));
    sleep(1);

    lt::file_storage fs;
    lt::add_files(fs, "./testfile");
    lt::create_torrent torrent(fs);
    lt::set_piece_hashes(torrent, lt::parent_path("."));

    std::vector<char> tmpBuf;
    lt::bencode(std::back_inserter(tmpBuf), torrent.generate());

    lt::add_torrent_params p;
    p.ti = std::make_shared<lt::torrent_info>(tmpBuf.data(), tmpBuf.size());
    p.save_path = ".";
    p.flags = lt::torrent_flags::update_subscribe |
              lt::torrent_flags::need_save_resume;
    auto handle = s.add_torrent(p);
    sleep(1);

    // wait for the user to end
    char a;
    int ret = std::scanf("%c\n", &a);
    (void)ret; // ignore
    return 0;
}

This code just does the following:

  1. create a torrent from ./testfile and add it to the session.
  2. when a piece_finished_alert occurs, call read_piece().
  3. when a read_piece_alert occurs, get torrent_file() from the handle.

I create a big file as testfile, for example:

truncate -s 50G testfile

Then just run the test code and the memory leak occurs. But after the file checking completes, the memory no longer leaks when I keep calling read_piece().

[screenshot: process memory usage during file checking]

Lwxiang avatar Jan 17 '23 08:01 Lwxiang

Do you know which object is leaking?

arvidn avatar Jan 17 '23 09:01 arvidn

I haven't found the leaking object. Besides torrent_file(), calling handle.status() from the read_piece_alert also causes leaking.

Lwxiang avatar Jan 18 '23 02:01 Lwxiang

@Lwxiang How do you interpret "memory leak"?

I haven't found the leaking object. Besides torrent_file(), calling handle.status() from the read_piece_alert also causes leaking.

So each call to torrent_file() within the loop allocates some additional memory and never frees it? So if you (for example) purposely skip this call on some iterations, does it allocate less memory?

glassez avatar Jan 18 '23 06:01 glassez

@glassez

So each call to torrent_file() within the loop allocates some additional memory and never frees it?

yes

So if you (for example) purposely skip this call on some iterations, does it allocate less memory?

With the test code above, I add a 50 GB file with 12800 pieces of 4096 KiB each. Calling torrent_file() about 2000 times (imprecisely) results in memory usage of about 20.4 GB RES, after all pieces are finished, there are no more read alerts, and the memory is never released:

[screenshot: ~20.4 GB RES]

while calling torrent_file() about 8000 times results in memory usage of about 37.1 GB RES:

[screenshot: ~37.1 GB RES]

The call counts are not very precise, but you can see that the more torrent_file() calls I make, the more memory remains after all actions are done.

Lwxiang avatar Jan 19 '23 02:01 Lwxiang

you can see that the more torrent_file() calls I make, the more memory remains after all actions are done.

It looks strange, considering that torrent_handle::torrent_file() isn't supposed to make a copy of the data, but only returns a shared pointer to the existing one.

glassez avatar Jan 19 '23 04:01 glassez
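For reference, the accessor in question is declared in torrent_handle.hpp along these lines (libtorrent 1.2):

// torrent_handle: hands back a shared_ptr to the torrent's existing
// metadata; no copy of the torrent_info is made
std::shared_ptr<const torrent_info> torrent_file() const;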

With the test code above, I add a 50 GB file with 12800 pieces of 4096 KiB each. Calling torrent_file() about 2000 times (imprecisely) results in memory usage of about 20.4 GB RES

Could you provide the modified code (limiting the calls to torrent_file()) that you used to test this?

glassez avatar Jan 19 '23 04:01 glassez

Could you provide the modified code (limiting the calls to torrent_file()) that you used to test this?

@glassez The limit can be changed via max_count:

#include <atomic>
#include <cstdio>      // std::scanf
#include <functional>
#include <future>
#include <iostream>
#include <memory>
#include <mutex>
#include <vector>

#include <unistd.h>    // sleep(), usleep()

// libtorrent
#include "libtorrent/alert_types.hpp"
#include "libtorrent/create_torrent.hpp"
#include "libtorrent/session.hpp"
#include "libtorrent/settings_pack.hpp"
#include "libtorrent/socket.hpp"
#include "libtorrent/torrent_flags.hpp"
#include "libtorrent/torrent_handle.hpp"
#include "libtorrent/torrent_info.hpp"
#include "libtorrent/torrent_status.hpp"

const lt::alert_category_t alertMask = lt::alert_category::status |
                                       lt::alert_category::file_progress |
                                       lt::alert_category::piece_progress |
                                       lt::alert_category::incoming_request |
                                       lt::alert_category::performance_warning |
                                       lt::alert_category::tracker |
                                       lt::alert_category::connect |
                                       lt::alert_category::peer;

int count = 0;
int max_count = 2000;
std::mutex mutex;

void alert_handler(lt::session* s)
{
    while (true)
    {
        std::vector<lt::alert*> alerts;
        s->pop_alerts(&alerts);

        for (const auto alert : alerts)
        {
            switch (alert->type())
            {
            case lt::add_torrent_alert::alert_type:
            {
                std::cout << "torrent added" << std::endl;
                break;
            }
            case lt::read_piece_alert::alert_type:
            {
                int current;
                {
                    std::lock_guard<std::mutex> lock(mutex);
                    current = ++count;
                }

                if (current >= max_count)
                {
                    break;
                }

                lt::alert_cast<lt::read_piece_alert>(alert)->handle.torrent_file();
                break;
            }
            case lt::piece_finished_alert::alert_type:
            {
                lt::piece_finished_alert* a = lt::alert_cast<lt::piece_finished_alert>(alert);
                a->handle.read_piece(a->piece_index);
                break;
            }
            default:
                break;
            }
        }

        usleep(100000);
    }
}

int main()
{
    lt::settings_pack sp;
    sp.set_str(lt::settings_pack::listen_interfaces, "127.0.0.1:60020");
    sp.set_int(lt::settings_pack::alert_mask, alertMask);
    sp.set_int(lt::settings_pack::aio_threads, 16);
    sp.set_int(lt::settings_pack::alert_queue_size, 10000);

    auto s = lt::session(sp);
    sleep(1);

    auto f = std::async(std::launch::async, std::bind(&alert_handler, &s));
    sleep(1);

    lt::file_storage fs;
    lt::add_files(fs, "./testfile");
    lt::create_torrent torrent(fs);
    lt::set_piece_hashes(torrent, lt::parent_path("."));

    std::vector<char> tmpBuf;
    lt::bencode(std::back_inserter(tmpBuf), torrent.generate());

    lt::add_torrent_params p;
    p.ti = std::make_shared<lt::torrent_info>(tmpBuf.data(), tmpBuf.size());
    p.save_path = ".";
    p.flags = lt::torrent_flags::update_subscribe |
              lt::torrent_flags::need_save_resume;
    auto handle = s.add_torrent(p);
    sleep(1);

    // wait for the user to end
    char a;
    int ret = std::scanf("%c\n", &a);
    (void)ret; // ignore
    return 0;
}

Lwxiang avatar Jan 19 '23 07:01 Lwxiang

I tested this a few more times and found that the amount of memory leaked does not necessarily seem to be positively correlated with the number of calls. Maybe there is something wrong in my test code. But it is true that there is some memory that cannot be released after the calls.

Lwxiang avatar Jan 19 '23 07:01 Lwxiang

Maybe it's read_piece() that causes the problem?..

glassez avatar Jan 19 '23 11:01 glassez

I strongly suspect that the memory being used is because of the (unbounded) number of calls to read_piece(). That call will cause libtorrent to allocate the size of one piece on the heap, copy that piece data into it and pass it back to you in an alert. Since you don't limit the number of simultaneous read_piece_alerts in the queue (i.e. it's unbounded) it's all a race to destruct those alert objects and free the piece memory. The faster you can free the alerts, the fewer of them will exist simultaneously.

i.e. you could most likely put any code there (e.g. sleeping for 1ms) and you'll see the memory usage increase.

arvidn avatar Jan 19 '23 12:01 arvidn
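A minimal sketch of such a bound (illustrative names like g_outstanding and kMaxOutstanding; this is not libtorrent API): only issue a new read_piece() while the number of in-flight requests is below a cap.

#include <atomic>
#include <vector>

#include "libtorrent/alert_types.hpp"
#include "libtorrent/session.hpp"

// Sketch only: cap the number of outstanding read_piece() requests so the
// piece-sized buffers held by queued read_piece_alerts stay bounded.
std::atomic<int> g_outstanding{0};
constexpr int kMaxOutstanding = 16;

void handle_alerts(lt::session& ses)
{
    std::vector<lt::alert*> alerts;
    ses.pop_alerts(&alerts);

    for (lt::alert* a : alerts)
    {
        if (auto* pf = lt::alert_cast<lt::piece_finished_alert>(a))
        {
            // only request the piece if we are under the cap; a real
            // implementation would queue skipped pieces and retry later
            if (g_outstanding.load() < kMaxOutstanding)
            {
                ++g_outstanding;
                pf->handle.read_piece(pf->piece_index);
            }
        }
        else if (lt::alert_cast<lt::read_piece_alert>(a))
        {
            // one in-flight read was delivered; its buffer is freed when
            // this alert batch is invalidated by the next pop_alerts()
            --g_outstanding;
        }
    }
}

Note the sketch simply drops reads that exceed the cap; a real client would remember those piece indices and re-request them as the alerts drain.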

Since you don't limit the number of simultaneous read_piece_alerts in the queue (i.e. it's unbounded) it's all a race to destruct those alert objects and free the piece memory. The faster you can free the alerts, the fewer of them will exist simultaneously.

But doesn't @Lwxiang claim that the allocated memory remains allocated even after all read_piece_alerts are processed (and supposed to be freed)? Or did I misinterpret it?

glassez avatar Jan 19 '23 15:01 glassez

But doesn't @Lwxiang claim that the allocated memory remains allocated even after all read_piece_alerts are processed (and supposed to be freed)? Or did I misinterpret it?

Yes. If the memory were just allocated and freed later, that would be fine. But after all read_piece_alerts are processed, the memory remains and is never freed.

Lwxiang avatar Jan 23 '23 14:01 Lwxiang

[screenshot: memory usage after all read_piece_alerts were processed]

This is not the memory usage during processing; it is after all read_piece_alerts were processed, plus a wait of about half an hour.

I am pretty sure that the memory leaks, but I don't know why.

Lwxiang avatar Jan 23 '23 15:01 Lwxiang

can you build with address sanitizer?

arvidn avatar Jan 23 '23 16:01 arvidn

can you build with address sanitizer?

I have tried; no leak errors show up.

Lwxiang avatar Jan 29 '23 03:01 Lwxiang

i.e. you could most likely put any code there (e.g. sleeping for 1ms) and you'll see the memory usage increase.

Yes, I have tried what you said.

The point is not torrent_file() or status(); the point is the processing time. The longer the processing takes, the more memory exists simultaneously. So just calling usleep(10000) also reproduces the case.

So maybe the unbounded number of read_piece_alerts being pushed into the queue prevents the old alerts from being released?

Lwxiang avatar Jan 29 '23 07:01 Lwxiang

If I call torrent_file() or other high-cost procedures in the read_piece_alert handler, the current generation of alerts may live longer (before being cleaned up) than before. That in turn causes the next generation of alerts to contain more read_piece_alerts (alerts keep arriving at the same rate), so the next generation stays alive even longer.

As in the following picture: [screenshot: per-generation alert counts and processing times]

The memory usage climbs higher and higher during processing because the alerts have longer and longer lifetimes. But notice that the last generation takes only 2 seconds to process 5k alerts; that is because the file checking finished and the torrent status changed to seeding.

So the questions become the following:

  1. Why does file checking slow down the processing speed of torrent_file() and other procedures?
  2. Why does the high memory usage still remain after everything is done?

Lwxiang avatar Jan 29 '23 09:01 Lwxiang

  1. Why does file checking slow down the processing speed of torrent_file() and other procedures?

#6996. libtorrent doesn't use a mutex-based approach to obtain data from the working thread; instead it queues "requests" to be invoked in that thread, so you have to wait until all current jobs are done.

glassez avatar Jan 29 '23 10:01 glassez
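One consequence is that synchronous calls like handle.status() or handle.torrent_file() pay a round-trip into a busy network thread. For status specifically, the session offers an asynchronous path; a small sketch (note the state_update_alert typically arrives on a later pop_alerts() call, not the one immediately following):

#include <iostream>
#include <vector>

#include "libtorrent/alert_types.hpp"
#include "libtorrent/session.hpp"

// Sketch: poll torrent status asynchronously instead of blocking on
// handle.status(). post_torrent_updates() queues the request; the result
// comes back as a state_update_alert for all torrents that changed.
void poll_status(lt::session& ses)
{
    ses.post_torrent_updates();

    std::vector<lt::alert*> alerts;
    ses.pop_alerts(&alerts);
    for (lt::alert* a : alerts)
    {
        if (auto* su = lt::alert_cast<lt::state_update_alert>(a))
            for (lt::torrent_status const& st : su->status)
                std::cout << st.name << ": " << st.progress << "\n";
    }
}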

For now, I just forbid read-piece actions while the torrent is in the checking-files state, to prevent this case (see the sketch below).

Lwxiang avatar Feb 02 '23 08:02 Lwxiang
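A guard along these lines, as a drop-in replacement for the piece_finished_alert case in the test code above (sketch only; it adds a blocking status() call, so caching the state from status alerts would be cheaper):

// Sketch: skip read_piece() while the torrent is still checking files
case lt::piece_finished_alert::alert_type:
{
    lt::piece_finished_alert* a = lt::alert_cast<lt::piece_finished_alert>(alert);
    if (a->handle.status().state != lt::torrent_status::checking_files)
        a->handle.read_piece(a->piece_index);
    break;
}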

@Lwxiang it seems to me that the problem with unbounded memory usage is fundamentally caused by you not having a limit on the number of pieces you keep in RAM at a time.

Is there a reason you cannot have a cap on that? It doesn't seem very robust to assume that you can handle alerts at a high enough rate to stay within some reasonable memory usage. It might work under the right circumstances on your computer, but perhaps not on other's computers.

arvidn avatar Feb 02 '23 10:02 arvidn

I see: the more pieces I keep in RAM at a time, the more memory it takes at a time; that is right. What I don't understand is why, after all read_piece_alerts are cleaned up, the memory usage is still high and never comes down. (I don't keep any of the piece buffer data.)

What I expect to see is the memory usage going up, maybe very high, but finally falling back down after all actions are done: [sketch: expected memory curve rising then falling]

But what I actually see, if I call read_piece() many times during the checking-files state, is memory like the following, even after I remove the torrent: [sketch: observed memory curve rising and staying high]

Lwxiang avatar Feb 02 '23 12:02 Lwxiang

Understanding that requires first understanding what you mean when you say "memory usage". There are at least 3 levels involved:

  1. The program calling malloc() and free()
  2. The runtime library implementing malloc() and free()
  3. The kernel allocating and deallocating address space for the process

I expect that you are looking at resident pages, which is in the realm of (3), whereas libtorrent is in (1). Perhaps the runtime library is predicting that the process may have another spike in memory allocations at any moment, and defers returning parts of its address space to the kernel. Or perhaps the memory ended up fragmented, making it difficult for the runtime library to return it.

In order to know whose behavior you need to understand, it would make sense to also profile the memory usage at the malloc() and free() level, with a heap profiler for instance.

arvidn avatar Feb 03 '23 17:02 arvidn
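A quick way to test the "allocator holds on to freed memory" hypothesis on glibc is malloc_trim() (a diagnostic, not a fix):

#include <malloc.h>  // glibc-specific

// Diagnostic sketch: ask glibc to return free heap space to the kernel.
// If RES drops noticeably after this call, the memory was already freed by
// the program and merely retained by the allocator, i.e. not a leak.
malloc_trim(0);

For the malloc()/free() level itself, a heap profiler such as valgrind's massif (valgrind --tool=massif ./test) records live allocations over time.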

note that the alerts from the last pop_alerts are not freed until you call pop_alerts() again.

arvidn avatar Feb 03 '23 17:02 arvidn
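So when measuring, one extra drain pass matters; a sketch:

// Sketch: the alerts returned by the previous pop_alerts() call (including
// the piece buffers owned by read_piece_alerts) are only destructed when
// pop_alerts() is called again.
std::vector<lt::alert*> drain;
s->pop_alerts(&drain);  // frees the previous batch, even if drain comes back empty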

I tried to look into this problem again recently.

I found that if I do lots of read_piece() calls during file checking, the size of the read_lru1 list in block_cache gets pretty high, and it is not released after all actions are done.

I think maybe maybe_free_piece skips some evict actions. Looking at the function below, we skip the evict if pe->marked_for_eviction is false.

But I think it should be evicted with allow_ghost? The later call uses marked_for_deletion to decide the parameters of evict_piece. Maybe we should delete the line || !pe->marked_for_eviction?

bool block_cache::maybe_free_piece(cached_piece_entry* pe)
{
	if (!pe->ok_to_evict()
		|| !pe->marked_for_eviction
		|| !pe->jobs.empty())
		return false;

	DLOG(stderr, "[%p] block_cache maybe_free_piece "
		"piece: %d refcount: %d marked_for_eviction: %d\n"
		, static_cast<void*>(this)
		, int(pe->piece), int(pe->refcount), int(pe->marked_for_eviction));

	tailqueue<disk_io_job> jobs;
	bool removed = evict_piece(pe, jobs
		, pe->marked_for_deletion ? disallow_ghost : allow_ghost);
	TORRENT_UNUSED(removed); // suppress warning
	TORRENT_PIECE_ASSERT(removed, pe);
	TORRENT_PIECE_ASSERT(jobs.empty(), pe);

	return true;
}
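i.e., the proposed (untested) change would reduce the guard to:

	// proposed change: drop the marked_for_eviction check so that clean,
	// unreferenced pieces can still be evicted (with allow_ghost) and
	// read_lru1 can shrink
	if (!pe->ok_to_evict()
		|| !pe->jobs.empty())
		return false;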

By the way, is there any block cache in version 2.0.x? I cannot find it.

Lwxiang avatar May 15 '23 09:05 Lwxiang

I found that if I do lots of read_piece() calls during file checking, the size of the read_lru1 list in block_cache gets pretty high, and it is not released after all actions are done.

It's only released if the cache size is near its max size. Is "pretty high" greater than your cache size?

It seems pretty obvious that the memory is used up by read_piece_alerts, holding a buffer for a whole piece. Can you confirm that that's not the case?

By the way, is there any block cache in version 2.0.x? I cannot find it.

No, there's not. It uses the page cache (via memory-mapped files) instead.

arvidn avatar May 15 '23 10:05 arvidn
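(For context: in 1.2.x the block-cache limit is configured via settings_pack::cache_size, expressed in 16 KiB blocks, so it can be pinned to a small, explicit value when checking whether read_lru1 respects it; a sketch:)

// Sketch, libtorrent 1.2.x: bound the block cache explicitly.
// cache_size is in 16 KiB blocks, so 4096 blocks = 64 MiB.
lt::settings_pack sp;
sp.set_int(lt::settings_pack::cache_size, 4096);
s.apply_settings(sp);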

There are two parts: read_piece_alerts hold some memory, and the block_cache holds another part.

According to the code, it seems only moving a piece to the ghost list enforces a max size; the lru1 list has no such limit.

The || !pe->marked_for_eviction condition means the following evict_piece call never ends up running with allow_ghost.

Lwxiang avatar May 15 '23 10:05 Lwxiang

I monitored the cache list sizes during the test.

By default the list size is 32, but the lru1 size increases to thousands and is not released after all actions are done.

[screenshot: cache list sizes, with lru1 in the thousands]

Lwxiang avatar May 15 '23 10:05 Lwxiang

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Sep 17 '23 01:09 stale[bot]