PyBaMM
Excessive RAM consumption when running multiple simulations (Solver Issue)
Problem description
When the same solver object is reused for multiple simulations, memory consumption grows with every simulation. The issue is triggered by the solver object. (If this is desired behaviour, please close the issue.)
This minimal example triggers the issue:
```python
import pybamm as pb
import gc

solver = pb.CasadiSolver(mode="safe")
para = pb.ParameterValues(chemistry=pb.parameter_sets.Chen2020)
model = pb.lithium_ion.SPM()
exp = pb.Experiment(["Discharge at 0.2C until 2.9V (1 minutes period)"])

for i in range(30):
    print(i)
    sim = pb.Simulation(model, experiment=exp, parameter_values=para, solver=solver)
    sim.solve()
    del sim
    gc.collect()
```
This minimal example works around the issue:
```python
import pybamm as pb
import gc

para = pb.ParameterValues(chemistry=pb.parameter_sets.Chen2020)
exp = pb.Experiment(["Discharge at 0.2C until 2.9V (1 minutes period)"])
model = pb.lithium_ion.SPM()

for i in range(30):
    print(i)
    solver = pb.CasadiSolver(mode="safe")  # creating the solver object every iteration
    sim = pb.Simulation(model, experiment=exp, parameter_values=para, solver=solver)
    sim.solve()
    del sim, solver  # deleting the objects every iteration frees the space
    gc.collect()
```
I have never done much with memory profiling. Do you know a workflow for this?
Ok, I think I understand what is going on (but not 100% sure).

**What's happening**
- When the simulation solves the model, it first has to set parameters and discretise. In doing so it creates a new copy of the model, so that the model you pass in is not unexpectedly changed.
- When solving, the solver first performs a "set up" step where it does some preprocessing on the model (calculating Jacobians, converting to casadi, etc.). To avoid doing this every time the same model is solved, it records the models it has already preprocessed in a dictionary, `self.models_set_up`.
- Hence in each iteration of the loop the solver sees a new model (since the simulation creates a new model each time), so it performs the "set up" step every time and stores the result. These are the really memory-expensive objects. They are stored in the solver and are not removed by garbage collection when the simulation is deleted (unless the solver is also deleted).
**Workaround**
I don't think it's worth doing anything about this at this stage. Depending on your use case some possible fixes are:
- Define the simulation outside the for loop so that the discretization only happens once (this will get you a big performance gain anyway)
- If that's not possible, and you don't want to delete the solver each time, clear the record of which models are saved in the solver:

```python
solver.models_set_up = {}
```
Let me know if either of these fixes the excessive RAM consumption issue for you.
I quickly tried your solution:
```python
import sys

import pybamm as pb

def workaround_sulzer_01():
    model = pb.lithium_ion.SPM()
    para = pb.ParameterValues(chemistry=pb.parameter_sets.Chen2020)
    exp = pb.Experiment(["Discharge at 0.2C until 2.9V (1 minutes period)"])
    solver = pb.CasadiSolver(mode="safe")
    for i in range(1000):
        sim = pb.Simulation(model, experiment=exp, parameter_values=para, solver=solver)
        sim.solve()
        solver.models_set_up = {}
        print(f"cycle {i:>4}, solver_size: {sys.getsizeof(solver):>4}B")
```
Unfortunately, the RAM consumption continues to increase with this workaround.
Do you think initialising a new solver every time costs a lot of computing time? With that approach the issue is solved (see the second code example in the first post).
What about defining the simulation outside the for loop? Or is that not possible for your use case?
I am updating the model every iteration, based on data from a PV system. It is a kind of lifetime simulation for PV batteries based on real data, and I feed my data in on an hourly basis as an experiment.
But even if I define the simulation outside, this would not solve the issue with the solver. RAM consumption is certainly increasing due to the solver and not the simulation.
If it helps, I can check which variable in the solver is causing the increase.
It would still be good to check what happens if you put the simulation outside the for loop. The simulation creating a new model each time might be the problem.
We have `pybamm.InputParameter`, which allows you to do things like this more efficiently (without redefining the model each time). You make something an input parameter instead of a regular `Parameter` or `Scalar`, and then you can specify its value at solve time. If you want to DM me a MWE I can help set this up.
Yes, it would be great to pinpoint the variable in the solver that is causing issues.
There are a couple more storage variables in the casadi solver, `solver.integrators` and `solver.integrator_specs`. For me this worked a lot better, but there is still a slow increase in RAM usage.
```python
import sys

import pybamm as pb

model = pb.lithium_ion.SPM()
para = pb.ParameterValues(chemistry=pb.parameter_sets.Chen2020)
exp = pb.Experiment(["Discharge at 0.2C until 2.9V (1 minutes period)"])
solver = pb.CasadiSolver(mode="safe")
for i in range(1000):
    sim = pb.Simulation(model, experiment=exp, parameter_values=para, solver=solver)
    sim.solve()
    solver.models_set_up = {}
    solver.integrators = {}
    solver.integrator_specs = {}
    print(f"cycle {i:>4}, solver_size: {sys.getsizeof(solver):>4}B")
```
Ah yeah I forgot about those. How did you generate that plot?
I used mprof: https://pypi.org/project/memory-profiler/
Any update on this @huegi ?
```python
import pybamm as pb

def workaround_RAM_loss():
    model = pb.lithium_ion.SPM()
    para = pb.ParameterValues(chemistry=pb.parameter_sets.Chen2020)
    exp = pb.Experiment(["Rest for 5 minutes"])
    solver = pb.CasadiSolver(mode="safe")
    sim = pb.Simulation(model, experiment=exp, parameter_values=para, solver=solver)
    for i in range(100):
        sim.solve()
        sim.set_up_experiment(
            pb.lithium_ion.SPM(), pb.Experiment(["Rest for 5 minutes"])
        )
        sim.solver.models_set_up = {}
        sim.solver.integrators = {}
        sim.solver.integrator_specs = {}
        print(f"cycle {i:>4}")
```
Running the code above without resetting the three solver variables results in:
Running the code with those three variables reset works totally fine:
Then I tried to find the variable(s) which is causing the problem:
Perhaps one of you can explain why the memory is only released when all three variables are deleted.
PS: If you select a longer experiment, the memory consumption increases in smaller steps than in @martinjrobins' picture. So you could probably also free up memory there.
It could be that the three variables reference the memory in the others, so deleting one has no effect as the memory is still referenced somewhere (this is a guess!). I agree that there is still a (small) memory leak somewhere. This may be specific to the casadi solver, as I also tried the scipy solver and did not see a memory increase over time.
Found out that the method `solver.step` triggers the additional memory leak.
I was able to track it down to the `base_solver` class. I see that this `func` is a vector, but I can't find it afterwards. Could it be that wrong memory associations are made there from time to time (it doesn't happen every cycle!)?
> Perhaps one of you can explain why the memory is only released when all three variables are deleted

All three dictionaries have the model as their key, so the model's memory only gets released when all three are deleted.
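This can be reproduced in plain Python (a generic sketch, not PyBaMM code): a dictionary holds a strong reference to each of its keys, so an object used as a key in several caches is only freed once all of them are cleared.

```python
import gc
import weakref

class Model:
    """Stand-in for a processed model object (hashable by identity)."""

models_set_up = {}   # mimics the solver's per-model caches
integrators = {}

m = Model()
alive = weakref.ref(m)           # lets us observe when m is collected
models_set_up[m] = "setup data"
integrators[m] = "integrator"

del m
gc.collect()
assert alive() is not None       # still referenced as a key in both dicts

models_set_up.clear()
gc.collect()
assert alive() is not None       # one cache still holds it

integrators.clear()
gc.collect()
assert alive() is None           # only now is the model freed
```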
> I was able to track it down to the base_solver class. I see that this func is a vector. But I can't find it afterwards. Could it be that wrong memory associations are made there from time to time (it doesn't happen every cycle!)?

I know nothing about memory leaks, but plausible. This `func` object is returned by the `process` function, so it becomes `rhs`, `algebraic`, etc. However in https://github.com/pybamm-team/PyBaMM/blob/develop/pybamm/solvers/base_solver.py#L400-L402 it doesn't assign the `func` to anything, so maybe that is the problem? Can you try with `model.events = []` to check?
Thank you for your idea with `model.events = []` @tinosulzer. I added it here in the base_solver. Now it seems to free up the memory after every iteration correctly.
Funny how Python deals with unreleased memory in this case. I have never seen anything like that. But I'm also relatively new to the game ;)
```python
import pybamm as pb

def workaround_RAM_loss():
    sim = pb.Simulation(
        model=pb.lithium_ion.SPM(),
        experiment=pb.Experiment(["Discharge at 0.2C until 3.4V"]),
        parameter_values=pb.ParameterValues(chemistry=pb.parameter_sets.Chen2020),
        solver=pb.CasadiSolver(mode="safe"),
    )
    for i in range(100):
        sim = myCycleFunc(sim)
        print(f"cycle: {i}")

def myCycleFunc(sim: pb.Simulation):
    sim.solver.models_set_up = {}
    sim.solver.integrators = {}
    sim.solver.integrator_specs = {}
    sim.set_up_experiment(
        pb.lithium_ion.SPM(), pb.Experiment(["Discharge at 0.2C until 3.4V"])
    )
    sim.solve()
    # --> normally I would analyze the results here
    return sim
```
Do you think that we have now found all the factors that lead to an increase in memory? The latter issue with `model.events` can probably be included in the code.
And regarding the first issue in the solver: what would you think of a `solver.clean` method that deletes the three dicts?
Can you try replacing https://github.com/pybamm-team/PyBaMM/blob/develop/pybamm/solvers/base_solver.py#L400-L402 with the following?

```python
_, event_eval, _ = process(
    event.expression, f"event_{n}", use_jacobian=False
)
```
Maybe the assignment to `_` will force it to reuse that memory? Removing `model.events` is not a good general solution to put in `base_solver`, as in some cases we do need the casadi form of the events.
Also, I will stress again that you're incurring some (probably unnecessary) overhead by running `sim.set_up_experiment` inside the for loop.
And feel free to add a `solver.reset()` method which clears those dictionaries.
I'm using `sim.set_up_experiment` because I am trying to emulate the behaviour of my actual code. There I'm running new experiments on an hourly basis and therefore have to update the model and call `set_up_experiment`. I've updated my minimal example so that it makes more sense:
```python
import pybamm as pb

def workaround_RAM_loss():
    model = pb.lithium_ion.SPM()
    exp_charge = pb.Experiment(["Charge at 0.2C for 30 min or until 4V"])
    exp_discharge = pb.Experiment(["Discharge at 0.2C for 30 min or until 3V"])
    sim = pb.Simulation(
        model=model,
        experiment=pb.Experiment(["Discharge at 0.2C for 120 min or until 3V"]),
        parameter_values=pb.ParameterValues(chemistry=pb.parameter_sets.Chen2020),
        solver=pb.CasadiSolver(mode="safe"),
    )
    sim.solve()
    for i in range(1000):
        model.set_initial_conditions_from(solution=sim.solution, inplace=True)
        sim.reset()  # new proposal to fix the issue
        if i % 2 == 0:
            sim.set_up_experiment(model, exp_charge)
        else:
            sim.set_up_experiment(model, exp_discharge)
        sim.solve()
        last_voltage = sim.solution["Terminal voltage [V]"].entries[-1]
        print(f"cycle: {i}; last_voltage: {last_voltage}")
```
In the following you can find the different memory profiles for 100 cycles:
As you can see, it needs the `model.events = []` expression to fix the problem.
I think the easiest fix to implement is a method in the `Simulation` class that resets the three solver attributes and the events of the model:
```python
def reset(self):
    self.solver.models_set_up = {}
    self.solver.integrators = {}
    self.solver.integrator_specs = {}
    self.model.events = []
```
On this last image you can see what's possible with this reset method (@1000 cycles). There is still some memory leakage, but I think it is small enough to run all applications without problems.
Removing the events of the model is not the way to go, it will almost certainly break something else.
Actually, your `reset` function is not removing the events, since `self.model` is a copy of `model` and you pass the original `model` back in when you set up the experiment. So the memory leak in the last plot is still from the events.
I encounter a similar problem when solving with the parameter `starting_solution`: memory increases to 7.2 GB by loop 574. I tried the same workarounds from here, but they didn't help.
@chuckliu1979 can you post your code?
> @chuckliu1979 can you post your code?

sure:
```python
import time

import numpy
import pybamm

def constant_t_eval():
    return numpy.linspace(0, 5760, 576)

def constant_var_pts():
    var = pybamm.standard_spatial_vars
    return {var.x_n: 20, var.x_s: 20, var.x_p: 20, var.r_n: 10, var.r_p: 10}

class PVExperiment(pybamm.Simulation):
    def __init__(
        self,
        model=None,
        experiment=None,
        parameter_values=None,
        var_pts=None,
        solver=None,
        options=None,
        power_supply=None,
        electrodes=None,
        cells=None,
    ):
        self._t_eval = None
        options = options or {"thermal": "lumped"}
        model = model or pybamm.lithium_ion.DFN(options=options)
        var_pts = var_pts or constant_var_pts()
        solver = solver or pybamm.CasadiSolver(mode="safe")
        solver.max_step = 100000
        parameter_values = parameter_values or pybamm.ParameterValues(chemistry=pybamm.parameter_sets.Chen2020)
        parameter_values.update(
            {
                "Number of electrodes connected in parallel to make a cell": electrodes or 4,
                "Number of cells connected in series to make a battery": cells or 1,
            },
            check_already_exists=False,
        )
        if experiment is None:
            parameter_values.update(
                {
                    "Power function [W]": power_supply or 39.620,
                },
                check_already_exists=False,
            )
        super().__init__(model=model, parameter_values=parameter_values, experiment=experiment, var_pts=var_pts, solver=solver)

    @property
    def t_eval(self):
        if self._t_eval is None and self.solution is not None:
            self._t_eval = self.solution["Time [s]"].entries
        return self._t_eval

    def solve(
        self,
        t_eval=None,
        solver=None,
        check_model=True,
        save_at_cycles=None,
        starting_solution=None,
        **kwargs,
    ):
        self._t_eval, t_eval = None, None
        return super().solve(
            t_eval=t_eval,
            solver=solver,
            check_model=check_model,
            save_at_cycles=save_at_cycles,
            starting_solution=starting_solution,
            **kwargs,
        )

if __name__ == '__main__':
    solution = None
    parameter_values = None
    for i in range(0, 574):
        print(f"loop {i}:")
        parameter_values = parameter_values or pybamm.ParameterValues(chemistry=pybamm.parameter_sets.Chen2020)
        vmin = parameter_values["Lower voltage cut-off [V]"]
        sim = PVExperiment(experiment=pybamm.Experiment([tuple([f"Discharge at 39.620 W for 10 seconds or until {vmin} V"])]), parameter_values=parameter_values)
        solution = sim.solve(starting_solution=solution)
        if solution is not None:
            print(solution["Terminal voltage [V]"].entries)
    print("Sleep...")
    time.sleep(600)
```
Hi - I've been looking into this issue today, and a major (but apparently not the only) problem appears to be a failure to release both the `Simulation` and `ElectrodeSOHSolver` instances for garbage collection, due to the use of `lru_cache` decorators in `simulation.py` and `full_battery_models/lithium_ion/electrode_soh.py`. Removing the decorators allows garbage collection to occur as normal.
The `lru_cache` containers can instead be retained by wrapping the functions during class construction, which makes the containers local to the instance (and hence destroyed when the instance goes out of scope, or is marked for deletion). Here's a nice explanation of the issue that I found while searching for a suitable solution: https://rednafi.github.io/reflections/dont-wrap-instance-methods-with-functoolslru_cache-decorator-in-python.html
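A minimal sketch of the difference (generic Python with hypothetical class names, not the PyBaMM code itself): decorating a method at class level stores `self` in a cache owned by the class, so instances are never released, whereas wrapping at construction time ties the cache's lifetime to the instance.

```python
import functools
import gc
import weakref

class Leaky:
    @functools.lru_cache(maxsize=None)   # class-level cache: keeps self alive
    def compute(self, x):
        return x * x

class PerInstance:
    def __init__(self):
        # wrap at construction so the cache dies with the instance
        self.compute = functools.lru_cache(maxsize=None)(self._compute)

    def _compute(self, x):
        return x * x

a = Leaky()
a.compute(2)
ref_a = weakref.ref(a)
del a
gc.collect()
assert ref_a() is not None   # leaked: the class-level cache still holds self

b = PerInstance()
b.compute(2)
ref_b = weakref.ref(b)
del b
gc.collect()                 # breaks the b <-> b.compute reference cycle
assert ref_b() is None       # instance-local cache is released with b
```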
As demonstration, here are some memory_profiler results for @huegi's original minimal example (slightly updated due to deprecation of the 'chemistry' argument in `ParameterValues`).
Code:

```python
import pybamm as pb
import gc

solver = pb.CasadiSolver(mode="safe")
para = pb.ParameterValues('Chen2020')
model = pb.lithium_ion.SPM()
exp = pb.Experiment(["Discharge at 0.2C until 2.9V (1 minutes period)"])

for i in range(30):
    print(i)
    sim = pb.Simulation(model, experiment=exp, parameter_values=para, solver=solver)
    sim.solve()
    del sim
    gc.collect()
```
Using decorators:
Using instance wrappers:
Note that memory consumption increases much more slowly, and drops from a peak of ~310 MB to ~180 MB. This does not entirely resolve the problem, but I can confirm that the destructors are now being called on each loop.
There is also one other use of `lru_cache` as an instance decorator in the repository (`expression_tree/symbol.py`).
Thanks @jsbrittain, nice catch. You would think the lru_cache docs would put some sort of warning on that decorator
@martinjrobins I remember implementing `@lru_cache` earlier when I was memoizing those methods in #2465: iirc, it was used because `@cache` was introduced in Python 3.9, and using it would break compatibility because we still support Python 3.8. I am not sure if `@cache`'s implementation for `maxsize` is different in this case, because it essentially runs `@lru_cache(maxsize=None)` as a wrapper internally.
@jsbrittain's solution of wrapping the functions during class construction doesn't seem to have a downside (apart from an extra line of code), so it sounds like we should do this for all the instances of `@lru_cache`, unless anyone has objections...
Nice catch @jsbrittain, happy to revert the change to get rid of `@lru_cache`. Though it doesn't completely fix this issue (which predates `@lru_cache`), since the memory shouldn't increase at all in this example.
@martinjrobins @tinosulzer - yes, as you say this is not the only issue...
The other issue arising from @huegi's minimal sample comes from the solver (as was originally identified), due to an accumulation of items in the `integrators` and `integrator_specs` dictionaries. As others have suggested, resetting these (or constructing a new solver on each iteration of the loop) resolves the problem (image below, run for 100 iterations). One possible solution here would be to implement a flag that discards old integrators, possibly by discarding the least recently used items (much like the LRU cache implementations used elsewhere).
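One hedged sketch of such a flag (generic Python, not existing PyBaMM API): cap the dictionaries with an LRU-style container that evicts the oldest entry once a size limit is reached.

```python
from collections import OrderedDict

class BoundedCache(OrderedDict):
    """Size-capped store, e.g. for solver integrators: once full, the
    least recently used entry is evicted so memory stays bounded."""

    def __init__(self, maxsize=2):
        super().__init__()
        self.maxsize = maxsize

    def __setitem__(self, key, value):
        super().__setitem__(key, value)
        self.move_to_end(key)            # mark as most recently used
        if len(self) > self.maxsize:
            self.popitem(last=False)     # drop the oldest entry

integrators = BoundedCache(maxsize=2)
for name in ("model_a", "model_b", "model_c"):
    integrators[name] = f"integrator for {name}"

# "model_a" has been evicted; only the two most recent entries remain
assert list(integrators) == ["model_b", "model_c"]
```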

@chuckliu1979's code sample, on the other hand, produces a growing `solution` object on each iteration, which is then captured by `sim` through the `starting_solution` parameter, so both `sim` and `solution` increase in size steadily over time (`solution` contains a number of lists which grow linearly with each successive iteration, as does `solution._summary_variables`). Discarding old estimates on each iteration of the loop seems to resolve the problem (below are results from the original code, and from truncating the lists to the most recent 2 items):


However, since having the ability to iteratively run simulations seems desirable, one option would be to provide a user-adjustable limit on the length / history of these lists, which could default so that only the current and previous value sets are retained, for example `keep_history=1`. Thus it seems that limiting the history of the solvers is likely to resolve both problem cases. Note that this suggestion also appears related to #2796.
Summarising:
- I would recommend associating the `lru_cache` methods with the class instance instead of the class object, since this seems like the originally desired behaviour.
- Should old integrators be discarded past some limit? This seems reasonable.
- Should the solution history be discarded past some limit? This also seems reasonable. However, I am not familiar enough with the codebase just yet to have a good feel as to whether this would prove problematic in any use cases. I can certainly implement these changes and run tests to find out.
Summary point 3 (above) is also likely to be resolvable by forwarding `solution.last_state` as the `starting_solution` (instead of the full solution profile); however, this would first require resolution of outstanding issue #2788.
I agree points 1 and 2 should be resolved as you suggest.
For point 3, I would prefer to keep the full history by default, and document that for RAM-intensive cases the user should pass in `starting_solution.last_state` (after we fix #2788), i.e. if the user passes in `starting_solution`, not truncating to `starting_solution.last_state`, and leaving that truncation up to the user if they so desire.
To give some more context, solving a model gives the outputs `t` and `y`, but often there are other variables of interest `f(t, y)`. The way solutions currently work is by storing only `t` and `y` by default, and evaluating any variable lazily when it is requested from the `Solution` dictionary, by evaluating `f(t[idx], y[:, idx])` for all `idx`. Therefore, it is important to keep the full history of `t` and `y` so that users have the possibility of evaluating any variable over the entire history.
The exception will be when running as described in #2796, in which case we know which variables to save and won't be saving `solution.y` anyway (though we should keep `solution.last_state` in that case for initializing future solutions).
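The lazy-evaluation pattern described above can be sketched generically (a hypothetical class, not the actual `Solution` implementation): store only `t` and `y`, and compute `f(t, y)` on first access.

```python
import numpy as np

class LazySolution:
    """Sketch: keep only t and y; evaluate derived variables on demand."""

    def __init__(self, t, y, variables):
        self.t = t                      # full time history
        self.y = y                      # full state history (states x times)
        self._variables = variables     # name -> f(t, y) callable
        self._cache = {}

    def __getitem__(self, name):
        if name not in self._cache:     # evaluate lazily, once
            f = self._variables[name]
            self._cache[name] = np.array(
                [f(self.t[i], self.y[:, i]) for i in range(len(self.t))]
            )
        return self._cache[name]

t = np.linspace(0.0, 1.0, 5)
y = np.vstack([t, t**2])
sol = LazySolution(t, y, {"sum of states": lambda t_, y_: y_.sum()})
derived = sol["sum of states"]   # computed from the stored t and y
```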
Another bug is that `Solution` contains `Solution.cycles`, a list of sub-solutions, and `Solution.cycles[idx].steps`, a list of sub-steps. This allows users to easily call a specific step, e.g. `solution.cycles[3].steps[0]`. However, the solution, cycles, and steps all hold a copy of the `y` matrix, which means (for large `y` matrices) the memory usage is 3x larger than it needs to be. Using views instead would alleviate this problem (can that be done in Python?)
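On the closing question: yes, NumPy's basic slicing already returns views that share the parent array's memory, so cycles and steps could in principle hold views of a single `y` matrix rather than copies. A minimal sketch:

```python
import numpy as np

y = np.arange(12.0).reshape(3, 4)   # stand-in for a solution's y matrix

step = y[:, 1:3]                    # basic slicing returns a view
assert np.shares_memory(step, y)    # no data copied

step[0, 0] = 99.0                   # writes through to the parent array
assert y[0, 1] == 99.0

isolated = y[:, 1:3].copy()         # explicit copy when isolation is needed
assert not np.shares_memory(isolated, y)
```

The trade-off is that a view keeps the whole parent array alive, so this saves memory only while the full history is wanted anyway.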