Serializing and deserializing linopy.Model
Hi,
I am currently exploring setting up larger models in parallel (in individual processes) and passing them back to the main process, because the individual models are fairly large but can be prepared largely independently of each other. Later on, specific instances are linked through a few additional constraints.
However, although serializing the model with pickle or dill works fine, deserializing it again throws a recursion error. As a consequence, ProcessPoolExecutor cannot be used to prepare models in parallel, since it relies on serialization to hand data over from one process to another. This can easily be reproduced with the following example:
import dill
import pandas as pd
import linopy
import pickle
m = linopy.Model()
time = pd.Index(range(10), name="time")
x = m.add_variables(
    lower=0,
    coords=[time],
    name="x",
)  # to be done in parallel process
y = m.add_variables(lower=0, coords=[time], name="y") # to be done in parallel process
factor = pd.Series(time, index=time) # to be done in parallel process
con1 = m.add_constraints(3 * x + 7 * y >= 10 * factor, name="con1") # to be done in parallel process
con2 = m.add_constraints(5 * x + 2 * y >= 3 * factor, name="con2") # to be done in parallel process
m.add_objective(x + 2 * y) # to be done in parallel process
with open("test.pkl", "wb") as f:
    dill.dump(m, f)
with open("test.pkl", "rb") as f:
    m2 = dill.load(f)
m2.variables["x"].lower = 1  # or add whatever additional constraint
m2.solve()
This throws the following error:
Traceback (most recent call last):
  File "C:\github\test\linopy\test.py", line 29, in <module>
    m2 = dill.load(f)
         ^^^^^^^^^^^^
  File "C:\github\test\.venv\Lib\site-packages\dill\_dill.py", line 289, in load
    return Unpickler(file, ignore=ignore, **kwds).load()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\github\test\.venv\Lib\site-packages\dill\_dill.py", line 444, in load
    obj = StockUnpickler.load(self)
          ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\github\test\.venv\Lib\site-packages\linopy\variables.py", line 1149, in __getattr__
    if name in self.data:
               ^^^^^^^^^
  File "C:\github\test\.venv\Lib\site-packages\linopy\variables.py", line 1149, in __getattr__
    if name in self.data:
               ^^^^^^^^^
  File "C:\github\test\.venv\Lib\site-packages\linopy\variables.py", line 1149, in __getattr__
    if name in self.data:
               ^^^^^^^^^
  [Previous line repeated 745 more times]
RecursionError: maximum recursion depth exceeded
@tburandt thanks for raising the issue, that's quite unfortunate. Pickling is not tested at the moment. How about storing the model as netCDF in the meanwhile? It should be about as fast as pickling.
This is most likely also the reason for the deepcopy issues within PyPSA on some networks. I had a look into this a while ago, but this is a better starting point, so I will check again.
I have the vague feeling that the `__getitem__` and `__getattr__` overrides could be related to this...
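That hunch matches the traceback. As a plain-Python sketch (class names hypothetical, only the pattern mirrors `variables.py`): a `__getattr__` override that reads an instance attribute recurses during unpickling, because pickle probes the half-built object for `__setstate__` while `__dict__` is still empty, so `self.data` itself falls back into `__getattr__`. Guarding the lookup via `__dict__` breaks the cycle:

```python
import pickle


class Broken:
    """__getattr__ touches self.data -> recursion while unpickling."""

    def __init__(self):
        self.data = {"x": 1}

    def __getattr__(self, name):
        # Called for *missing* attributes. During unpickling, __dict__ is
        # still empty, so looking up self.data re-enters __getattr__ forever.
        if name in self.data:
            return self.data[name]
        raise AttributeError(name)


class Fixed:
    """Same behaviour, but the lookup cannot re-enter __getattr__."""

    def __init__(self):
        self.data = {"x": 1}

    def __getattr__(self, name):
        # Fetch "data" straight from __dict__; if it is not there yet
        # (e.g. mid-unpickling), fail fast with AttributeError instead.
        data = self.__dict__.get("data")
        if data is not None and name in data:
            return data[name]
        raise AttributeError(name)


try:
    pickle.loads(pickle.dumps(Broken()))
except RecursionError:
    print("Broken: RecursionError")

obj = pickle.loads(pickle.dumps(Fixed()))
print("Fixed:", obj.x)
```

Pickling the `Broken` instance succeeds; only unpickling recurses, which is exactly the behaviour reported above.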
@FabianHofmann the problem is that multiprocessing and ProcessPoolExecutor (from concurrent.futures), for example, use pickle (or dill, I am not sure) to hand objects over from one process to another or back to the main process.
For storing the model manually, I can try netCDF. I might have an idea how to solve my problem with that, at least :)