
snapshot validate: unbound memory

Open LesnyRumcajs opened this issue 1 year ago • 22 comments

Issue summary

The memory needed to validate a mainnet snapshot appears to be unbounded. On a 16 GB RAM machine, it causes an OOM kill after a minute or so.

According to @lemmih, the culprit is the FVM

Sigh, I think the FVM is taking up 19GiB of RAM. We'll need to address that at some point.

Command: forest-cli snapshot validate <mainnet snapshot>
Commit: b03ca5d61a18c236fcaa9bfebeae706108e6ed85

Other information and links

LesnyRumcajs avatar Aug 01 '23 13:08 LesnyRumcajs

This kills one of our killer features :/

aatifsyed avatar Aug 01 '23 15:08 aatifsyed

Could it be mitigated by limiting parallelization in validate_tipsets?

hanabi1224 avatar Aug 01 '23 15:08 hanabi1224

Could it be mitigated by limiting parallelization in validate_tipsets?

Yep, thus killing our killer feature.

lemmih avatar Aug 08 '23 07:08 lemmih

I think the WASM engine settings might be to blame. https://github.com/filecoin-project/ref-fvm/blob/f31c6d3a64278f98270e5a13fc6e8be11e5c534e/fvm/src/engine/mod.rs#L137

    // wasmtime default: OnDemand
    // We want to pre-allocate all permissible memory to support the maximum allowed recursion limit.

Things to investigate:

  • [ ] Does re-initializing the MultiEngine reset the memory usage?
  • [ ] Do different wasm_config settings affect memory usage?

lemmih avatar Aug 08 '23 07:08 lemmih
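As a sketch of the two knobs mentioned in the checklist above, using wasmtime's Config API (the actual values and wiring live in ref-fvm's engine/mod.rs; this is an illustration, not the real configuration):

```rust
// Sketch only: the wasmtime settings referenced above. The real
// configuration is in ref-fvm's fvm/src/engine/mod.rs.
use wasmtime::{Config, InstanceAllocationStrategy};

fn engine_config(instance_memory_maximum_size: u64) -> Config {
    let mut c = Config::new();
    // wasmtime default: 4 GB of reserved static memory per instance.
    // Lowering this bounds what each engine can map.
    c.static_memory_maximum_size(instance_memory_maximum_size);
    // wasmtime default: OnDemand. The ref-fvm comment quoted above refers
    // to pre-allocating all permissible memory up front instead.
    c.allocation_strategy(InstanceAllocationStrategy::OnDemand);
    c
}
```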

Isn't wasm32 limited to 4 GB?

LesnyRumcajs avatar Aug 08 '23 07:08 LesnyRumcajs

Isn't wasm32 limited to 4 GB?

I think they even lower the limit from 4GiB to 2GiB. But they have a pool of engines, one for each core, each with a 2GiB limit.

    /// Maximum size of memory used during the entire (recursive) message execution. This currently
    /// includes Wasm memories and table elements and will eventually be extended to include IPLD
    /// blocks and actor code.
    ///
    /// DEFAULT: 2GiB
    pub max_memory_bytes: u64,
    // wasmtime default: 4GB
    c.static_memory_maximum_size(instance_memory_maximum_size);

lemmih avatar Aug 08 '23 07:08 lemmih

So on my 32 cores it would require 64GB?

LesnyRumcajs avatar Aug 08 '23 07:08 LesnyRumcajs

So on my 32 cores it would require 64GB?

As far as I can tell, yes.

lemmih avatar Aug 08 '23 07:08 lemmih
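The worst-case bound implied by the exchange above is simple arithmetic: one engine per core, each allowed up to 2 GiB. A minimal sketch:

```rust
// Back-of-the-envelope bound from the discussion above: a pool of
// engines, one per core, each with a 2 GiB memory limit.
fn worst_case_engine_memory(cores: u64) -> u64 {
    const MAX_MEMORY_BYTES: u64 = 2 * 1024 * 1024 * 1024; // 2 GiB per engine
    cores * MAX_MEMORY_BYTES
}

fn main() {
    let cores = 32;
    let bytes = worst_case_engine_memory(cores);
    // prints: 32 cores -> 64 GiB worst case
    println!("{} cores -> {} GiB worst case", cores, bytes >> 30);
}
```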

| Change Description | Network | No. of Threads | Epochs Validated | Snapshot Info | RSS | VSZ |
|---|---|---|---|---|---|---|
| BaseLine | Calibnet | 1 | 60 | forest_snapshot_calibnet_2023-08-14_height_822490.forest.car.zst (1.9Gb) | 739.01 MB | 3036361.56 MB |
| BaseLine | Calibnet | 2 | 60 | forest_snapshot_calibnet_2023-08-14_height_822490.forest.car.zst (1.9Gb) | 762.50 MB | 3036432.92 MB |
| BaseLine | Calibnet | 4 | 60 | forest_snapshot_calibnet_2023-08-14_height_822490.forest.car.zst (1.9Gb) | 846.67 MB | 3036739.99 MB |
| BaseLine | Calibnet | 8 | 60 | forest_snapshot_calibnet_2023-08-14_height_822490.forest.car.zst (1.9Gb) | 866.19 MB | 3037035.18 MB |
| BaseLine | Calibnet | 1 | 1999 | forest_snapshot_calibnet_2023-08-14_height_822490.forest.car.zst (1.9Gb) | 898.11 MB | 3036352.50 MB |
| BaseLine | Calibnet | 2 | 1999 | forest_snapshot_calibnet_2023-08-14_height_822490.forest.car.zst (1.9Gb) | 878.01 MB | 3036441.77 MB |
| BaseLine | Calibnet | 4 | 1999 | forest_snapshot_calibnet_2023-08-14_height_822490.forest.car.zst (1.9Gb) | 900.31 MB | 3036771.31 MB |
| BaseLine | Calibnet | 8 | 1999 | forest_snapshot_calibnet_2023-08-14_height_822490.forest.car.zst (1.9Gb) | 934.74 MB | 3037106.39 MB |
| BaseLine | Mainnet | 1 | 60 | forest_snapshot_mainnet_2023-08-14_height_3122221.forest.car.zst (57Gb) | 4020.62 MB | 3048476.51 MB |
| BaseLine | Mainnet | 2 | 60 | forest_snapshot_mainnet_2023-08-14_height_3122221.forest.car.zst (57Gb) | 4056.17 MB | 3048855.77 MB |
| BaseLine | Mainnet | 4 | 60 | forest_snapshot_mainnet_2023-08-14_height_3122221.forest.car.zst (57Gb) | 4107.39 MB | 3048985.30 MB |
| BaseLine | Mainnet | 8 | 60 | forest_snapshot_mainnet_2023-08-14_height_3122221.forest.car.zst (57Gb) | 4088.47 MB | 3048548.39 MB |
| BaseLine | Mainnet | 1 | 120 | forest_snapshot_mainnet_2023-08-14_height_3122221.forest.car.zst (57Gb) | 4519.46 MB | 3049692.50 MB |
| BaseLine | Mainnet | 2 | 120 | forest_snapshot_mainnet_2023-08-14_height_3122221.forest.car.zst (57Gb) | 4561.81 MB | 3049611.76 MB |
| BaseLine | Mainnet | 4 | 120 | forest_snapshot_mainnet_2023-08-14_height_3122221.forest.car.zst (57Gb) | 4613.37 MB | 3049918.31 MB |
| BaseLine | Mainnet | 8 | 120 | forest_snapshot_mainnet_2023-08-14_height_3122221.forest.car.zst (57Gb) | | |
| BaseLine | Mainnet | 8 | 1500 | forest_snapshot_mainnet_2023-08-14_height_3122221.forest.car.zst (57Gb) | 14523.47 MB | |

sudo-shashank avatar Aug 14 '23 09:08 sudo-shashank

@sudo-shashank What are you measuring?

lemmih avatar Aug 14 '23 10:08 lemmih

@sudo-shashank What are you measuring?

I'm trying to measure the memory held during the validate run using the ps -o rss= -p "$pid" command in a script.

sudo-shashank avatar Aug 14 '23 11:08 sudo-shashank
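The sampling described above can be sketched as a small loop. This sketch reads VmRSS from /proc (Linux), which is the same figure ps -o rss= reports; the pid and sampling count are placeholders for whatever the actual script used:

```rust
use std::fs;

/// Resident set size in kB for `pid`, read from /proc/<pid>/status.
/// This is the same number `ps -o rss= -p <pid>` prints.
/// Returns None if the process has exited.
fn rss_kb(pid: u32) -> Option<u64> {
    let status = fs::read_to_string(format!("/proc/{pid}/status")).ok()?;
    status
        .lines()
        .find(|l| l.starts_with("VmRSS:"))?
        .split_whitespace()
        .nth(1)?
        .parse()
        .ok()
}

fn main() {
    // In the measurement script this would be forest's pid; here we
    // sample our own process as a stand-in.
    let pid = std::process::id();
    let mut peak = 0;
    for _ in 0..5 {
        if let Some(rss) = rss_kb(pid) {
            peak = peak.max(rss);
        }
    }
    println!("peak RSS: {} kB", peak);
}
```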

How many epochs are you validating and how many threads are you using?

lemmih avatar Aug 14 '23 12:08 lemmih

(As noted in this issue, memory usage depends entirely on how many threads you're using, so that is vital information you must include in your results.)

lemmih avatar Aug 14 '23 12:08 lemmih

How many epochs are you validating and how many threads are you using?

60 epochs for now, 8 threads

sudo-shashank avatar Aug 14 '23 12:08 sudo-shashank

How many epochs are you validating and how many threads are you using?

60 epochs now, single core

For calibnet, that should only take a few seconds to evaluate. You'll get better data if you benchmark for longer than a few seconds.

lemmih avatar Aug 14 '23 13:08 lemmih

How many epochs are you validating and how many threads are you using?

60 epochs now, single core

When you say a single core, do you mean a single thread? Using a single thread to reproduce a problem that only happens when you use a lot of threads isn't wise.

lemmih avatar Aug 14 '23 13:08 lemmih

How many epochs are you validating and how many threads are you using?

60 epochs now, single core

When you say a single core, do you mean a single thread? Using a single thread to reproduce a problem that only happens when you use a lot of threads isn't wise.

I checked the config: I am using 4 cores and have 16 GB of RAM available. The expected peak RSS for forest snapshot validate was 8 GiB (4 × 2 GiB), but I am seeing a peak RSS of only 4 GiB for a mainnet snapshot validation.

sudo-shashank avatar Aug 14 '23 14:08 sudo-shashank

How many epochs are you validating and how many threads are you using?

60 epochs now, single core

When you say a single core, do you mean a single thread? Using a single thread to reproduce a problem that only happens when you use a lot of threads isn't wise.

I checked the config: I am using 4 cores and have 16 GB of RAM available. The expected peak RSS for forest snapshot validate was 8 GiB (4 × 2 GiB), but I am seeing a peak RSS of only 4 GiB for a mainnet snapshot validation.

The exact amount of memory used is not important. What is important is how the memory usage scales with the number of threads.

lemmih avatar Aug 14 '23 18:08 lemmih

In my observations so far, memory usage does not scale with the number of threads; rather, it scales with the number of epochs we validate. More epochs means more memory utilisation, peaking at a maximum of ~15 GiB for 1999 epochs, for both mainnet and calibnet snapshots.

sudo-shashank avatar Aug 23 '23 05:08 sudo-shashank

Moving @sudo-shashank to different tasks.

lemmih avatar Aug 31 '23 08:08 lemmih

I have tried this various times with forest-tool snapshot validate --check-links=0 forest_mainnet.forest.car --check-stateroots=2000

I have noticed that the memory usage depends on where we are in the queue.

For example, with ~1500 stateroots left in the queue, the memory usage is steady at ~12 GB and it manages to clean up the extra memory used just fine, but then it seems to start growing again. With 1100 items left in the queue, it's about 15 GB. So the further down the rabbit hole we go, the more memory is used.

The auto-detected parallelism is 10 on my machine.

I have tried a chunked approach, where the MultiEngine is reinitialised every n items, but that does not seem to have any impact when chunking by 100. Chunks of 20 do seem to have a positive impact on the memory footprint, but they hurt performance more, because we are forced to wait until the current chunk is processed before starting the next one.

ruseinov avatar Nov 15 '23 13:11 ruseinov
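The chunked approach described above can be sketched generically. Here `Engine` is a placeholder for the FVM MultiEngine (expensive to create, holds memory until dropped), and the chunk size is the knob being tuned; this is an illustration of the trade-off, not Forest's actual validation code:

```rust
use std::thread;

// Placeholder for the FVM MultiEngine: expensive to create, and it holds
// on to memory until it is dropped.
struct Engine;
impl Engine {
    fn new() -> Self {
        Engine
    }
    fn validate(&self, tipset: u64) -> u64 {
        tipset // placeholder for per-tipset validation work
    }
}

/// Validate `tipsets` in chunks, reinitializing the engine between chunks
/// so its memory can be released. Smaller chunks cap memory growth, but
/// each chunk boundary is a synchronization point: all threads must drain
/// the current chunk before the next one starts, costing throughput.
fn validate_chunked(tipsets: &[u64], chunk_size: usize, threads: usize) -> u64 {
    let mut total = 0u64;
    for chunk in tipsets.chunks(chunk_size) {
        let engine = Engine::new(); // fresh engine per chunk
        let engine = &engine;
        let per_thread = ((chunk.len() + threads - 1) / threads).max(1);
        total += thread::scope(|s| {
            let handles: Vec<_> = chunk
                .chunks(per_thread)
                .map(|part| {
                    s.spawn(move || part.iter().map(|t| engine.validate(*t)).sum::<u64>())
                })
                .collect();
            handles.into_iter().map(|h| h.join().unwrap()).sum::<u64>()
        });
        // `engine` is dropped here, so its memory can be reclaimed before
        // the next chunk begins -- the behaviour the comment above was
        // trying to achieve.
    }
    total
}

fn main() {
    let tipsets: Vec<u64> = (1..=1999).collect();
    let total = validate_chunked(&tipsets, 100, 8);
    println!("validated work units: {total}");
}
```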

I have also tried the approach that initialises an engine for each tipset just to see what that does - that slows things down almost to a halt. I'm going to do memory profiling next to see what exactly is eating up the RAM. I'm concerned that the memory does not get cleaned up properly with the chunked approach and reinitialisation.

ruseinov avatar Nov 15 '23 18:11 ruseinov