starknet-devnet-rs
Memory exhaustion
Describe the bug (observed vs expected behavior)
We completed our migration from devnet python to devnet rust, and after a few adjustments all of our tests are running again. There is only one issue: our tests take around 20 minutes to run, and during that time the devnet memory keeps increasing. After 5 minutes it reaches 7.99 GB and devnet starts to slow down; after another minute the container stops (I could not find a specific error).
As a workaround we are calling the restart endpoint every 2 minutes, and that seems to help. I still see the memory increasing, but it's good enough for our test suite.
Not reproducible on alpha-goerli
- [x] This issue is only present on Devnet and cannot be reproduced on alpha-goerli (check the box if true).
 
To Reproduce
Steps to reproduce the behavior:
- Keep using devnet rust for 20 minutes
 
Devnet version
- I am using Devnet version: shardlabs/starknet-devnet-rs:64c425b832b96ba09b49646fe0fbf49862c0fb6d
 - [x] This happens with a dockerized Devnet (check the box if true).
 - This does not appear on the following Devnet version:
 
System specifications
- OS: macOS, and also GitHub Actions running on Ubuntu
 
I reproduced it with the starknet.js tests and can confirm that behavior: after 3 test runs starknet-devnet-rs takes 22 GB of memory, which it shouldn't.
@ivpavici suggested checking if it's not accidentally downloading Netflix movies in the background.
@mikiw I think you might be onto something with that last idea:
While minding my own business, I was digging through the code and had some ideas about this issue.
- I saw that we clone the state in several places (look for state.clone())
 - Those are refactoring candidates, maybe something is not properly dropped due to our use of Arc
- If visual analysis tools weren't of help, there are Rust APIs that return resource usage information
 - Could be used for inserting function calls as sort of breakpoints inside the code
 - Thus we could check where exactly memory grows most
- Do we know since when this has been a problem?
 - If there is a simple enough testing procedure for the reproduction of this problem (e.g. sending N requests which eventually cause the memory to grow; N being sufficiently large), it could be reused to backtrack revisions and potentially find one without these issues
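The comment above does not say which resource usage APIs were meant. As one possible approach using only the standard library, here is a minimal sketch of a counting global allocator that can act as the "breakpoints" described: wrap a suspicious call in measure and see how many bytes stay allocated afterwards. The names CountingAllocator, allocated_bytes and measure are hypothetical and are not part of starknet-devnet-rs.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Bytes currently allocated through the global allocator.
static ALLOCATED: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical wrapper around the system allocator that keeps a running total.
struct CountingAllocator;

unsafe impl GlobalAlloc for CountingAllocator {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let ptr = unsafe { System.alloc(layout) };
        if !ptr.is_null() {
            ALLOCATED.fetch_add(layout.size(), Ordering::Relaxed);
        }
        ptr
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) };
        ALLOCATED.fetch_sub(layout.size(), Ordering::Relaxed);
    }
    // realloc and alloc_zeroed fall back to the default implementations,
    // which go through alloc/dealloc above, so the counter stays consistent.
}

#[global_allocator]
static GLOBAL: CountingAllocator = CountingAllocator;

/// Current number of live heap bytes, as seen by the counter.
fn allocated_bytes() -> usize {
    ALLOCATED.load(Ordering::Relaxed)
}

/// Runs `f` and returns its result together with the change in live heap bytes.
/// Allocations made by other threads during the call are counted as well.
fn measure<T>(f: impl FnOnce() -> T) -> (T, isize) {
    let before = allocated_bytes() as isize;
    let out = f();
    let after = allocated_bytes() as isize;
    (out, after - before)
}
```

Wrapping, for example, the handling of a single request in measure and logging the delta would show which code paths keep memory alive after they return. The same check could be reused when backtracking revisions.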
 
@sgc-code hey, do you need old state history? Like, being able to query old states with an old block_id?
Hi @FabijanC, if we had a flag to disable old history, it could work around our issue for now. We don't currently need to access the older blocks. Maybe you want to look into some more efficient storage options later. Thanks
@sgc-code Sure, we are also considering more efficient storage options, but for now we were considering conditional disabling of state storing as the quickest fix.
Thanks for the reply
The flag was added in https://github.com/0xSpaceShard/starknet-devnet-rs/pull/290, but I'm still not sure whether the current code is correct when the flag is set to StateArchiveCapacity::Full. I would treat this PR as one that adds the ability to enable or disable the feature; I didn't check whether the enabled feature is implemented correctly.
@sgc-code https://github.com/0xSpaceShard/starknet-devnet-rs/pull/290 was merged, so memory should no longer grow insanely.
I suspect the largest part of the state history is contract classes. As has already been mentioned somewhere, storing a clone of the state (together with declared classes) for each block is highly inefficient. That's why we are currently defaulting to not storing the state history.
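To illustrate why archiving a full clone per block is costly, here is a rough sketch and not the actual devnet code. Only the name StateArchiveCapacity::Full comes from this thread; the None variant, the Devnet and State structs, and all methods are assumptions made for the example.

```rust
use std::collections::HashMap;

type BlockNumber = u64;

/// Only `Full` is named in the thread; `None` is an assumed counterpart.
#[derive(Clone, Copy, PartialEq)]
enum StateArchiveCapacity {
    None,
    Full,
}

/// Hypothetical state: class hash -> serialized class bytes.
#[derive(Clone, Default)]
struct State {
    classes: HashMap<u64, Vec<u8>>,
}

/// Hypothetical devnet core that optionally archives the state per block.
struct Devnet {
    capacity: StateArchiveCapacity,
    latest_block: BlockNumber,
    state: State,
    archive: HashMap<BlockNumber, State>,
}

impl Devnet {
    fn new(capacity: StateArchiveCapacity) -> Self {
        Devnet {
            capacity,
            latest_block: 0,
            state: State::default(),
            archive: HashMap::new(),
        }
    }

    /// Creates a block; with `Full`, a deep copy of the whole state is kept.
    fn create_block(&mut self) {
        self.latest_block += 1;
        if self.capacity == StateArchiveCapacity::Full {
            self.archive.insert(self.latest_block, self.state.clone());
        }
    }

    /// Total class bytes held by archived states only.
    fn archived_class_bytes(&self) -> usize {
        self.archive
            .values()
            .flat_map(|s| s.classes.values())
            .map(Vec::len)
            .sum()
    }
}
```

In this sketch, a single 1000-byte class followed by 10 blocks leaves 10,000 bytes of class data in the archive under Full and none under None, so the archived size scales with the number of blocks times the size of all declared classes.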
Proposal
Instead of cloning all of the classes, we could have one global storage of type HashMap<ClassHash, (ContractClass, BlockNumber)> (wrapped in e.g. Arc<Mutex<...>> or Arc<RwLock<...>>) - meaning we could store the class together with the block number when it was added.
Why
So cloning a state would only clone the reference (Arc) to the contract class map.
Implementation details
When retrieving a class (in the response of the JSON-RPC method starknet_getClass), we would return the class only if stored_block_number <= query_block_number.
Refactoring might be needed to store a contract class together with the next block number, because currently we store the class before the next block is created.
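The proposal could be sketched roughly as follows. ClassHash, ContractClass and BlockNumber are simplified placeholders for the real types, and ClassStorage, declare and get_class_at are hypothetical names chosen for illustration.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

// Placeholder types; the real ones are richer.
type ClassHash = u64;
type BlockNumber = u64;

#[derive(Clone, Debug, PartialEq)]
struct ContractClass(Vec<u8>);

/// One global class map, shared by reference between all state clones.
#[derive(Clone, Default)]
struct ClassStorage {
    inner: Arc<RwLock<HashMap<ClassHash, (ContractClass, BlockNumber)>>>,
}

impl ClassStorage {
    /// Stores the class with the block number at which it was added.
    /// An already stored class keeps its original block number.
    fn declare(&self, hash: ClassHash, class: ContractClass, block: BlockNumber) {
        self.inner
            .write()
            .unwrap()
            .entry(hash)
            .or_insert((class, block));
    }

    /// Returns the class only if stored_block_number <= query_block_number.
    fn get_class_at(&self, hash: ClassHash, query_block: BlockNumber) -> Option<ContractClass> {
        let map = self.inner.read().unwrap();
        match map.get(&hash) {
            Some((class, stored)) if *stored <= query_block => Some(class.clone()),
            _ => None,
        }
    }
}

/// Hypothetical state: cloning it copies the per-block data,
/// but only bumps the reference count of the class map.
#[derive(Clone, Default)]
struct State {
    classes: ClassStorage,
    storage: HashMap<u64, u64>,
}
```

Because every clone shares the same map, a state archived at an early block can also see classes declared later; the block number check in get_class_at is what hides them from queries against older blocks.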