
Slow memory leak with rocksdbjni

Open areyohrahul opened this issue 9 months ago • 17 comments

Hello,

I'm currently using version 8.5.3 of rocksdbjni and I've encountered a persistent memory leak issue. After researching similar problems, I've identified a few potential causes:

One common issue is failing to properly close the RocksObject from Java, as discussed thoroughly in this GitHub issue (https://github.com/facebook/rocksdb/issues/9962). This problem appears to have surfaced in versions >= 7.

I attempted to address this by patching the AbstractNativeReference and AbstractImmutableNativeReference classes in version 8.5.3 with those from version 6.29.5. However, the memory leak persisted, which led me to believe that the issue may not lie in the object closures. Strangely, though, the issue disappeared completely when I downgraded to version 6.29.5.
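For reference, here is a minimal plain-Java sketch (no RocksDB dependency; `FakeNativeHandle` is my stand-in for a RocksObject, not a real class) of how a missed close() leaks native memory, and how try-with-resources prevents it:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Stand-in for a RocksObject: "owns" native memory until close() is called.
class FakeNativeHandle implements AutoCloseable {
    static final AtomicInteger openHandles = new AtomicInteger();
    FakeNativeHandle() { openHandles.incrementAndGet(); }
    @Override public void close() { openHandles.decrementAndGet(); }
}

class CloseDemo {
    public static void main(String[] args) {
        // Leaky: never closed. Relying on GC/finalization to release the
        // underlying native memory is unreliable.
        FakeNativeHandle leaked = new FakeNativeHandle();

        // Safe: try-with-resources guarantees close(), even on exceptions.
        try (FakeNativeHandle h = new FakeNativeHandle()) {
            // ... use h ...
        }

        System.out.println("unclosed handles: " + FakeNativeHandle.openHandles.get());
        // prints "unclosed handles: 1" (the first handle still holds its memory)
    }
}
```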

Another possibility pertains to cache-related configuration. If index and filter blocks are not cached in the block cache, their memory usage is practically unbounded and can look like a leak. I've already configured my blocks to be cached in the block cache with strict_capacity = true, so this doesn't appear to be the root cause.

A third potential issue relates to the use of a different memory allocator. Several similar cases have been reported, as referenced in links like https://github.com/facebook/rocksdb/issues/10324, http://smalldatum.blogspot.com/2015/10/myrocks-versus-allocators-glibc.html, and https://blog.cloudflare.com/the-effect-of-switching-to-tcmalloc-on-rocksdb-memory-use/.

I'm uncertain about how to implement this with RocksDBJNI. Is recompiling RocksDB with a special flag necessary, or is LD_PRELOAD sufficient? I attempted using LD_PRELOAD, but I'm unsure whether it had the desired effect, as memory usage continued to increase slowly.
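One way to check whether LD_PRELOAD actually took effect is to look for the allocator library in the process's own memory maps. A minimal sketch (plain Java, Linux-only; the class name is mine):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class AllocCheck {
    // Lines of /proc/self/maps that mention a preloaded allocator library.
    static List<String> mappedAllocators() throws IOException {
        try (Stream<String> lines = Files.lines(Paths.get("/proc/self/maps"))) {
            return lines.filter(l -> l.contains("jemalloc") || l.contains("tcmalloc"))
                        .distinct()
                        .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        List<String> allocators = mappedAllocators();
        if (allocators.isEmpty()) {
            System.out.println("no jemalloc/tcmalloc mapped: LD_PRELOAD did not take effect");
        } else {
            allocators.forEach(System.out::println);
        }
    }
}
```

If the preload worked, the mapped .so path shows up; if the list is empty, the JVM is still on glibc malloc.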

Additionally, I have a few lingering questions:

  • If downgrading to version 6.29.5 resolved the issue, could it still be related to a memory allocator problem?
  • If the issue lies in improper closure of RocksObjects, wouldn't this have been rectified by patching version 8.5.3 with 6.29.5?
  • It seems that memory is not being reclaimed even after closing the rocksdb object. A similar issue is outlined here: https://github.com/facebook/rocksdb/issues/10324.

I hope this provides a clearer overview of the situation. Any insights or suggestions would be greatly appreciated.

Thank you!

areyohrahul avatar Oct 26 '23 09:10 areyohrahul

Is there a way to reproduce this in a localized, longevity benchmark?

vjeko avatar Oct 26 '23 20:10 vjeko

@areyohrahul We are going to take a look at this and try to help you work out what is going on and resolve the issue. My colleague @alanpaxton should be able to assist you.

cmccrorie avatar Oct 30 '23 11:10 cmccrorie

rocksdb_memory.zip

This code reproduces the memory leak on my Mac (I haven't tried it on Linux). It first writes 10M random POJOs to the DB, then reads 10M entries, and then starts an infinite loop that performs read-modify-write operations on the DB. You need to change the RocksDB directory in Rocks.java to run it locally.

areyohrahul avatar Oct 30 '23 16:10 areyohrahul

Thanks @cmccrorie . Please let me know if anything is required to better help solve this.

areyohrahul avatar Oct 30 '23 16:10 areyohrahul

Thanks @areyohrahul - a repro is just what I need. I am on macos, so I will give it a try.

alanpaxton avatar Oct 30 '23 17:10 alanpaxton

Hi @areyohrahul, could you characterise how slow the leak is? What numbers are you seeing, and from which tools? How long do you need to run it to see the leak?

I have analysed the repro running under the YourKit profiler, and I do not see any significant pure-Java leaks. This is not a surprise; any real problem is likely to be C++ RocksDB data allocated and not freed by the Java native code. I'm going to look at it with jemalloc, but it will help if I know what scale of leak I am looking for. Thanks.

Some specifics: when the process reaches the update phase it sits at anywhere between 12-15 GB of memory (via Activity Monitor), but 10 GB of that is accounted for by the RocksDB LRUCache you have configured; there is a running JVM and other RocksDB buffers, etc., so that does not seem in the least unreasonable. But I do not see any consistent increase in memory usage. I haven't even needed to try jemalloc yet; I will give the repro longer to run to see whether I notice any significant increase in memory use, and then see if a jemalloc run shows me anything.

alanpaxton avatar Oct 31 '23 10:10 alanpaxton

Hi @alanpaxton, the memory leak is quite slow on my system. It increased by 1-2% every day on an 80 GB machine.

I use two commands to check the memory:

  1. sar -r 1 shows an increase in %memused (over a very long time)
  2. top shows increasing RES memory over time

This is running on Debian 11 (bullseye) with Java 11. On a side note, I raised a similar issue in another forum and it was not reproducible on Ubuntu 22.04.3 LTS with Java 17. What system are you using?

Also, I tried using tcmalloc with this, but on Debian 11 (bullseye) RocksDB throws a "double free pointer" error. The same code works with tcmalloc on Ubuntu. (I wanted to give it a try based on this blog: https://blog.cloudflare.com/the-effect-of-switching-to-tcmalloc-on-rocksdb-memory-use/.)

Also, RocksDB 6.29.5 shows a very negligible memory leak compared to 8.5.3. I thought the reason was this PR: https://github.com/facebook/rocksdb/pull/9523, so I patched these two classes (AbstractNativeReference and AbstractImmutableNativeReference, to check whether a RocksObject was somehow not being closed) from 6.29.5 into 8.5.3, but it didn't help.

One last thing: it's very strange that RocksDB 8.5.3 uses a lot more memory than 6.29.5 right at the start of the process.

areyohrahul avatar Nov 01 '23 08:11 areyohrahul

Can you share your full configuration file for RocksDB please?

Also, we need a discrete set of steps, from start to finish, that we can perform to exactly replicate your problem, please.


adamretter avatar Nov 01 '23 08:11 adamretter

Hey @adamretter and @alanpaxton , I'll try to be as detailed as possible. Still, if anything more is required then please let me know.

My environment config is given below:

  • Debian 11 (bullseye)
  • Java 11.0.20
  • RocksDB (rocksdbjni) 8.5.3
  • ldd (Debian GLIBC 2.31-13+deb11u6) 2.31
  • 16 cores, 80 GB machine

I've changed the code a bit and have increased the number of threads for writing, reading, and updating. I'm attaching the zip of the project with this comment. You can directly use the JAR in the target folder of the project.

Link

Follow these steps to run the JAR:

  1. Either change the RocksDB directory in Rocks.java (line 50) or create the directory /var/lib/mustang
  2. SCP target/core-1.0-SNAPSHOT-jar-with-dependencies.jar to the remote machine
  3. SSH into the machine and run java -cp core-1.0-SNAPSHOT-jar-with-dependencies.jar com.rocksdb.test.Main

After running this script for a while, I can see that as soon as the reads start, my RSS goes up to about 30% (on an 80 GB machine; even more if the reads are done in an infinite loop) and then increases very, very slowly during the updates.

As mentioned earlier, I check the memory using sar -r 1 and top commands. I'm attaching the output of the same below:

sar -r 1

Screenshot 2023-11-02 at 1 16 48 AM

top

3779575 root 20 0 37.1g 22.6g 28068 S 723.3 28.7 542:31.43 java

Along with this, I'm also attaching my LOG file. This might be useful to you. rocks_example_logs.txt

Some major config parameters used by the program (they can also be found in the code):

  • LRU cache size: 5 GB, with strict capacity
  • Max memtables: 5
  • Min memtables to merge: 3
  • Memtable size: 100 MB
  • Max open files: 100
  • Pin L0 index and filter blocks: true
  • Cache index and filter blocks: true
  • Cache index and filter blocks with high priority: true

Some additional information:

The same code behaves better on an Ubuntu 22.04 machine with Java 11 and a 4-core, 16 GB configuration.

UPDATE:

I modified the same code a bit and added an infinite loop for the reads. The current RSS is a little over 30 GB. Based on my calculations, it should have been less than 10 GB (5 GB cache + 1 GB memtables + 512 MB heap + 1-2 GB other JVM native memory).
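For clarity, the arithmetic behind that estimate, written out (an illustrative sketch; the class name is mine):

```java
class MemoryBudget {
    // Upper-bound estimate from the numbers above (sizes in MiB).
    static long expectedMib() {
        long cache = 5 * 1024;       // 5 GB block cache
        long memtables = 1024;       // ~1 GB memtables (mutable + immutable)
        long heap = 512;             // 512 MB Java heap
        long otherNative = 2 * 1024; // 1-2 GB other JVM native memory (upper end)
        return cache + memtables + heap + otherNative;
    }

    public static void main(String[] args) {
        System.out.println("expected upper bound: " + expectedMib() + " MiB (8.5 GiB)");
        // versus the >30 GiB RSS actually observed
    }
}
```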

areyohrahul avatar Nov 01 '23 19:11 areyohrahul

@adamretter

Also, I've mentioned this in one of my previous comments.

tcmalloc doesn't work with RocksDB on Debian 11: my application fails to start with a "double free or corruption (out)" error. If I remove RocksDB from the picture, tcmalloc runs fine with my app, even on Debian.

The same code with RocksDB works fine on Ubuntu 22.04. Should I raise a separate issue for this?

I use LD_PRELOAD env variable to set tcmalloc.

areyohrahul avatar Nov 01 '23 20:11 areyohrahul

Hi @areyohrahul, I have been working to reproduce your problem on Ubuntu 20 (more or less Debian 11) with openjdk version "17.0.8.1" 2023-08-24. Running a version of your test code that I rebuilt with a smaller MAX_ENTRIES (to reach the interesting phase faster, and to fit my somewhat smaller machine), I can confirm that over an hour or so there does seem to be a slow upward creep of the %mem in RSS.

So I installed jemalloc and ran again

$ sudo apt install libjemalloc2

This installs jemalloc for me at /usr/lib/x86_64-linux-gnu/libjemalloc.so.2, and I can then run the test code using jemalloc via

$ LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.2 MALLOC_CONF=prof:true,lg_prof_interval:30,lg_prof_sample:17 java -cp target/core-1.0-SNAPSHOT-jar-with-dependencies.jar com.rocksdb.test.Main

Again I see the %mem increasing very slowly, and jemalloc's profiler regularly dumps heap checkpoint files of the form

jeprof.<PID>.<index>.i<index>.heap

I can then compare later and earlier heap samples and generate a text report by running (here PID=945176, comparing the 1000th and 600th heap samples)

jeprof --show_bytes --text `which java` jeprof.945176.1000.i1000.heap --base jeprof.945176.600.i600.heap > i1000i600.txt

Examining i1000i600.txt shows the following differences in allocation between the two checkpoints:

55783653  74.8%  74.8% 55783653  74.8% rocksdb::BlockFetcher::ReadBlockContents
18614440  25.0%  99.8% 18614440  25.0% rocksdb::Arena::AllocateNewBlock
  786720   1.1% 100.9%   786720   1.1% rocksdb::lru_cache::LRUCacheShard::CreateHandle
  418564   0.6% 101.4%   418564   0.6% os::malloc
  323655   0.4% 101.9%   323655   0.4% std::vector::emplace_back
  262368   0.4% 102.2%   262368   0.4% rocksdb::VersionBuilder::Rep::ApplyFileAddition
  131328   0.2% 102.4%   131328   0.2% rocksdb::ConcurrentArena::ConcurrentArena
  131328   0.2% 102.6%   131328   0.2% rocksdb::FilterPolicy::CreateFromString

Of course, this could be perfectly reasonably explained by the block cache being at a high-water mark at checkpoint 1000, rather than by a leak per se. But it offers a starting point for where to look.

Do you want to try to follow what I have done, to confirm that you can run the testbench in your own environment using jemalloc?

alanpaxton avatar Nov 07 '23 16:11 alanpaxton

Hi @alanpaxton, I can give it a shot, but I wanted to inform you that I attempted using both jemalloc and tcmalloc with my application, and unfortunately, the memory continued to increase with both of them.

I do have a couple of questions though.

Which system were you using previously where you didn't encounter any memory leaks? Additionally, have you let this program run for an extended period of time and observed a total OOM error?

areyohrahul avatar Nov 08 '23 05:11 areyohrahul

Hi @areyohrahul perhaps I wasn't clear. It's not a surprise that memory grows even with jemalloc; but given that it does, I think it would be helpful for you to run your workload with the jemalloc options I've suggested, producing heap logs as you go, and analysing these heap logs with jeprof. When your memory gets very full the text dumps of the latest logs should give some idea what is holding all the memory.

alanpaxton avatar Nov 08 '23 09:11 alanpaxton

@areyohrahul is this resolved now?

adamretter avatar Jan 23 '24 21:01 adamretter

Hey @adamretter, this is still not resolved but I have some more observations. I'll try to summarise it here with some context of the workload.

In my use case, RocksDB initially reads around 10M records; then concurrent (around 1K RPS) read-update-write operations run on it indefinitely. This read-update-write phase is when I see the slow memory leak.

I'm attaching a screenshot of a script that monitors the RSS and VSZ of a process every second. This will give you an idea about how slow the memory leak is in my case.

At the start of the application <Time> <RSS> <VSZ>

Screenshot 2024-02-08 at 10 41 44 PM

After a few hours

Screenshot 2024-02-08 at 10 42 40 PM

I believe some other info will be required before we jump to any conclusion about the memory leak. Here are some data points for this application:

  1. Process heap size is 40G
  2. No block cache is used
  3. Total memtable size (mutable and immutable) is 1G

Based on this, my RSS shouldn't be more than 42-45 GB (given that the JVM uses some native memory as well), but it clearly is, and it's increasing every hour.

Now I'll show you the jemalloc profiling data that @alanpaxton suggested.

I ran the following command to compare 2 slices of heap:

jemalloc/bin/jeprof --show_bytes --text `which java` /logs/dumps/heap.2605409.2476.i2476.heap --base /var/lib/mustang/server/logs/dumps/heap.2605409.1000.i1000.heap

A subset of the total output is given below:

Total: 222309691 B
210969191  94.9%  94.9% 210969191  94.9% rocksdb::Arena::AllocateNewBlock
12095206   5.4% 100.3% 12095206   5.4% rocksdb::UncompressBlockData
 1158527   0.5% 100.9%  1158527   0.5% std::string::_Rep::_S_create@@GLIBCXX_3.4
  655600   0.3% 101.2%   655600   0.3% rocksdb::BlockCreateContext::Create
  393360   0.2% 101.3%   393360   0.2% rocksdb::lru_cache::LRUCacheShard::CreateHandle
  267742   0.1% 101.5%   267742   0.1% SUNWprivate_1.1
  262192   0.1% 101.6%   262192   0.1% std::_Rb_tree::_M_copy
  131096   0.1% 101.6%   131096   0.1% std::_Rb_tree::_M_insert_unique
  131088   0.1% 101.7%   131088   0.1% rocksdb::VersionStorageInfo::GenerateFileLocationIndex

I'm attaching a slightly stripped-down version of my exact code below.

I still don't know what I'm doing wrong here, but please feel free to ask for any more info if required.

Some other info about the system:

Java - openjdk version "11.0.18" 2023-01-17
OS - Debian GNU/Linux 11 (bullseye)

areyohrahul avatar Feb 08 '24 17:02 areyohrahul

Please excuse me for directly adding the code in the comment below but I'm unable to upload any file from my laptop.

public class RocksNRTBackend extends NRTBackend {

    private final static ObjectMapper OBJECT_MAPPER = new ObjectMapper();
    private final static Integer MAX_NUM_MEMTABLES = 6;
    private final static Integer MIN_NUM_IMMUTABLE_MEMTABLES = 3;
    private final static Integer FILTER_BITS = 15;
    private final static Long MEMTABLE_SIZE_IN_BYTES = 100 * 1024 * 1024L;
    private final static String DUMP_DIRECTORY = BootstrapOrchestrator.DUMP_DIRECTORY;
    private final static String DOMAIN = "f";
    private final static String TENANT = "s";
    private final static String LISTING_ENTITY = "n";
    private final static String METADATA_ENTITY = "m";
    private final static String LISTING_COLUMN_FAMILY_NAME = "l";
    private final static BlockingQueue<ListingEntity> asyncWriteQueue = new LinkedBlockingQueue<>();
    private static final Timer GET_TIMER = Metrics.newTimer(RocksNRTBackend.class, "rocks-get-timer");
    private static final Timer SAVE_TIMER = Metrics.newTimer(RocksNRTBackend.class, "rocks-save-timer");
    private static final Meter GET_METER = Metrics.newMeter(RocksNRTBackend.class, "rocks-get-counter", "rocks-get-counter", TimeUnit.SECONDS);
    private static final Meter SAVE_METER = Metrics.newMeter(RocksNRTBackend.class, "rocks-save-counter", "rocks-save-counter", TimeUnit.SECONDS);
    private final Statistics statistics;
    private final DBOptions dbOptions;
    private final BloomFilter listingBloomFilter;
    private final BlockBasedTableConfig listingBlockBasedTableConfig;
    private final ColumnFamilyOptions listingColumnFamilyOptions;
    private final ColumnFamilyOptions metadataColumnFamilyOptions;
    private final EntityIdentifier LISTING_ENTITY_IDENTIFIER;
    private final EntityIdentifier METADATA_ENTITY_IDENTIFIER;
    private final ScheduledExecutorService executorService = Executors.newScheduledThreadPool(2);

    private static volatile RocksNRTBackend INSTANCE;

    private List<ColumnFamilyHandle> cfhandle;

    List<ColumnFamilyHandle> columnFamilyHandles = new ArrayList<>();

    private RocksDB rocksDB;


    public static RocksNRTBackend getInstance() {
        if (INSTANCE == null) {
            synchronized (RocksNRTBackend.class) {
                if (INSTANCE == null) {
                    INSTANCE = new RocksNRTBackend();
                }
            }
        }
        return INSTANCE;
    }

    private RocksNRTBackend() {
        RocksDB.loadLibrary();

        LISTING_ENTITY_IDENTIFIER = new EntityIdentifier(DOMAIN, TENANT, LISTING_ENTITY);
        METADATA_ENTITY_IDENTIFIER = new EntityIdentifier(DOMAIN, TENANT, METADATA_ENTITY);

        statistics = new Statistics();
        statistics.setStatsLevel(StatsLevel.ALL);

        dbOptions = new DBOptions();
        dbOptions.setCreateIfMissing(Boolean.TRUE);
        dbOptions.setCreateMissingColumnFamilies(Boolean.TRUE);
        dbOptions.setAllowConcurrentMemtableWrite(true);
        dbOptions.setIncreaseParallelism(Runtime.getRuntime().availableProcessors());
        dbOptions.setMaxOpenFiles(100);
        dbOptions.setStatistics(statistics);
        dbOptions.setStatsDumpPeriodSec(30);
        dbOptions.setUseDirectIoForFlushAndCompaction(false);

        listingBloomFilter = new BloomFilter(FILTER_BITS);

        listingBlockBasedTableConfig = new BlockBasedTableConfig();
        listingBlockBasedTableConfig.setNoBlockCache(true);
        listingBlockBasedTableConfig.setPinL0FilterAndIndexBlocksInCache(false);
        listingBlockBasedTableConfig.setCacheIndexAndFilterBlocks(false);
        listingBlockBasedTableConfig.setCacheIndexAndFilterBlocksWithHighPriority(false);
        listingBlockBasedTableConfig.setFilterPolicy(listingBloomFilter);

        listingColumnFamilyOptions = new ColumnFamilyOptions();
        listingColumnFamilyOptions.setCompressionType(CompressionType.LZ4_COMPRESSION);
        listingColumnFamilyOptions.setBottommostCompressionType(CompressionType.LZ4_COMPRESSION);
        listingColumnFamilyOptions.setNumLevels(3);
        listingColumnFamilyOptions.setTableFormatConfig(listingBlockBasedTableConfig);
        listingColumnFamilyOptions.setWriteBufferSize(MEMTABLE_SIZE_IN_BYTES);
        listingColumnFamilyOptions.setMaxWriteBufferNumber(MAX_NUM_MEMTABLES);
        listingColumnFamilyOptions.setMinWriteBufferNumberToMerge(MIN_NUM_IMMUTABLE_MEMTABLES);
        listingColumnFamilyOptions.setCompactionStyle(CompactionStyle.LEVEL);

        metadataColumnFamilyOptions = new ColumnFamilyOptions();

        RocksColumnFamilyDescriptor listingBlobDescriptor = RocksColumnFamilyDescriptor.builder()
                .registryKey(LISTING_ENTITY_IDENTIFIER.getRegistryKey())
                .columnFamilyName(LISTING_COLUMN_FAMILY_NAME)
                .columnFamilyOptions(listingColumnFamilyOptions)
                .build();

        RocksColumnFamilyDescriptor metadataBlobDescriptor = RocksColumnFamilyDescriptor.builder()
                .registryKey(METADATA_ENTITY_IDENTIFIER.getRegistryKey())
                .columnFamilyName(LISTING_COLUMN_FAMILY_NAME)
                .columnFamilyOptions(metadataColumnFamilyOptions)
                .build();

        try {
            List<ColumnFamilyDescriptor> descriptors = new ArrayList<>(parseAndSaveDescriptors(Arrays.asList(listingBlobDescriptor, metadataBlobDescriptor)));
            fillDefaultDescriptorIfAbsent(DUMP_DIRECTORY, descriptors);
            this.rocksDB = RocksDB.open(this.dbOptions, DUMP_DIRECTORY, descriptors, columnFamilyHandles);
            cfhandle = Collections.singletonList(columnFamilyHandles.get(0));
        } catch (Exception ex) {
            log.info("RocksDB: Initialisation failed for RocksDB lib", ex);
        }
    }

    private static List<ColumnFamilyDescriptor> parseAndSaveDescriptors(List<RocksColumnFamilyDescriptor> descriptors) throws RocksValidationException {
        List<ColumnFamilyDescriptor> columnFamilyDescriptors = new ArrayList<>();

        for (RocksColumnFamilyDescriptor rocksColumnFamilyDescriptor: descriptors) {
            if (rocksColumnFamilyDescriptor.getRegistryKey() == null) {
                throw new RocksValidationException("Registry key cannot be null for a descriptor");
            }
            if (rocksColumnFamilyDescriptor.getColumnFamilyName() == null) {
                throw new RocksValidationException("Column family name cannot be null for a descriptor");
            }
            if (rocksColumnFamilyDescriptor.getColumnFamilyOptions() == null) {
                throw new RocksValidationException("Column family options cannot be null for a descriptor");
            }
            ColumnFamilyDescriptor columnFamilyDescriptor = new ColumnFamilyDescriptor(
                    (rocksColumnFamilyDescriptor.getRegistryKey() + RocksConstants.KEY_DELIMITER + rocksColumnFamilyDescriptor.getColumnFamilyName()).getBytes(),
                    rocksColumnFamilyDescriptor.getColumnFamilyOptions());
            columnFamilyDescriptors.add(columnFamilyDescriptor);
        }

        return columnFamilyDescriptors;
    }

    protected void fillDefaultDescriptorIfAbsent(String dirPath, List<ColumnFamilyDescriptor> descriptors) throws Exception {
        // Close the temporary Options handle (it wraps native memory) instead of leaking it
        List<byte[]> storedColumnFamilies;
        try (Options options = new Options()) {
            storedColumnFamilies = RocksDB.listColumnFamilies(options, dirPath);
        }

        for (byte[] storedColumnFamily: storedColumnFamilies) {
            boolean isColumnFamilyDescriptorPresent = false;
            for (ColumnFamilyDescriptor columnFamilyDescriptor: descriptors) {
                if (Arrays.equals(columnFamilyDescriptor.getName(), storedColumnFamily)) {
                    isColumnFamilyDescriptorPresent = true;
                    break;
                }
            }
            if (!isColumnFamilyDescriptorPresent) {
                addColumnFamilyDescriptor(storedColumnFamily, descriptors);
            }
        }

        // Case for the first start
        if (CollectionUtils.isEmpty(storedColumnFamilies)) {
            addColumnFamilyDescriptor("default".getBytes(), descriptors);
        }
    }

    protected void addColumnFamilyDescriptor(byte[] columnFamilyName, List<ColumnFamilyDescriptor> descriptors) {
        ColumnFamilyDescriptor defaultDescriptor = new ColumnFamilyDescriptor(columnFamilyName);
        defaultDescriptor.getOptions().setWriteBufferSize(RocksConstants.DEFAULT_WRITE_BUFFER_SIZE)
                .setMaxWriteBufferNumber(RocksConstants.DEFAULT_MAX_WRITE_BUFFER_NUMBER)
                .setMinWriteBufferNumberToMerge(RocksConstants.DEFAULT_MIN_WRITE_BUFFER_NUMBER_TO_MERGE)
                .setCompactionStyle(RocksConstants.DEFAULT_COMPACTION_STYLE);
        descriptors.add(defaultDescriptor);
    }

    @Override
    public boolean isAsyncQueueEmpty() {
        return asyncWriteQueue.isEmpty();
    }

    @Override
    public synchronized void close() throws Exception {
        executorService.shutdown();
        // Close the native FlushOptions handle after the flush
        try (FlushOptions flushOptions = new FlushOptions()) {
            rocksDB.flush(flushOptions);
        }
        rocksDB.close();
        statistics.close();
        listingColumnFamilyOptions.close();
        listingBloomFilter.close();
        metadataColumnFamilyOptions.close();
        dbOptions.close();
        INSTANCE = null;
    }

    public void save(List<ListingEntity> listings) {
        if (!ConfigServiceConfigs.getInstance().isBootstrapOptimisationEnabled()) {
            return;
        }
        long start = System.currentTimeMillis();
        if (listings.isEmpty()) {
            return;
        }
        TimerContext timer = SAVE_TIMER.time();
        // try-with-resources releases the native WriteOptions/WriteBatch handles
        // even when serialization or the write throws; the previous explicit
        // close() calls were skipped on exception, leaking native memory
        try (WriteOptions writeOptions = new WriteOptions();
             WriteBatch writeBatch = new WriteBatch()) {
            for (ListingEntity listing : listings) {
                writeBatch.put(columnFamilyHandles.get(0), listing.getListingId().getBytes(), OBJECT_MAPPER.writeValueAsBytes(listing));
            }
            this.rocksDB.write(writeOptions, writeBatch);
            long deltaTime = System.currentTimeMillis() - start;
            if (deltaTime > 100) {
                log.info("RocksDB: Flushed data for {} listing(s) in {} ms", listings.size(), deltaTime);
            }
        } catch (Exception e) {
            log.error("RocksDB: Error occurred while saving listing(s)", e);
        }
        timer.stop();
        SAVE_METER.mark(listings.size());
    }

    @Override
    public void saveAsync(List<ListingEntity> listings) {
        if (!ConfigServiceConfigs.getInstance().isBootstrapOptimisationEnabled()) {
            return;
        }
        if (listings.isEmpty()) {
            return;
        }
        asyncWriteQueue.addAll(listings);
    }

    public Map<String, ListingEntity> get(List<String> listingIds) {
        if (!ConfigServiceConfigs.getInstance().isBootstrapOptimisationEnabled()) {
            return Collections.emptyMap();
        }
        if (listingIds.isEmpty()) {
            return Collections.emptyMap();
        }
        Map<String, ListingEntity> parsedData = new HashMap<>(listingIds.size());
        long start = System.currentTimeMillis();
        TimerContext timer = GET_TIMER.time();
        // try-with-resources releases the native ReadOptions handle even when
        // deserialization throws; the previous explicit close() was skipped on exception
        try (ReadOptions readOptions = new ReadOptions()) {
            List<byte[]> keys = new ArrayList<>();
            for (String listingId : listingIds) {
                keys.add(listingId.getBytes());
            }
            List<byte[]> response = rocksDB.multiGetAsList(readOptions, Collections.nCopies(listingIds.size(), columnFamilyHandles.get(0)), keys);
            for (byte[] data : response) {
                // multiGetAsList returns null entries for keys that were not found
                if (data == null) {
                    continue;
                }
                ListingEntity listing = OBJECT_MAPPER.readValue(data, ListingEntity.class);
                parsedData.put(listing.getListingId(), listing);
            }
            long deltaTime = System.currentTimeMillis() - start;
            if (deltaTime > 100) {
                log.info("RocksDB: Loaded data for {} listing(s) in {} ms", parsedData.size(), deltaTime);
            }
        } catch (Exception e) {
            log.error("RocksDB: Error occurred while bootstrapping listing(s)", e);
        }
        timer.stop();
        GET_METER.mark(listingIds.size());
        return parsedData;
    }
}

areyohrahul avatar Feb 08 '24 17:02 areyohrahul

Hi @areyohrahul, I have tried again to run your jar, and I can't see any increase in memory usage, even over a number of hours. Your jeprof output suggests that for you memory usage increased by about 200 MB from i1000 to i2476. Is the usage consistently increasing across the samples in between, or does it go up and down? Does the increase account for all the extra memory you are seeing in your RSS/VSZ script? Can you read anything from the graph if you run jeprof --pdf? (I'd be interested in seeing such a graph.)

alanpaxton avatar Feb 13 '24 16:02 alanpaxton

Closing due to a lack of follow-up from the poster.

adamretter avatar Mar 25 '24 11:03 adamretter