Issue with the solver GLPK
Hi, it looks like I am having a significant issue with GLPK. Long story short: I have a model that works sufficiently well. I am now trying to run several climate scenarios to assess the climate risk for the considered system.
When running these scenarios (mainly based on a resampling of the T and P variables, plus step changes on these variables), the solver crashes in the middle of the simulation, stating that NO PRIMAL SOLUTION EXISTS.
Sometimes I get the error message below:
Assertion failed: teta >= 0.0 Error detected in file ..\src\simplex\spxchuzr.c at line 294
Searching online for this assertion suggests it may come from numerical instability. Have you encountered such an issue in the past? Thanks
I have only encountered this when a NaN has made its way through as a row bound or objective coefficient. You could debug by rebuilding with --enable-debug passed to setup.py - make sure it does a complete rebuild though. This should ensure there are checks that all values going to GLPK are finite.
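The kind of finiteness check the debug build enables can be approximated in plain Python before values reach the solver. A minimal sketch (the function name and usage are illustrative, not pywr's API):

```python
import math

def assert_all_finite(name, values):
    """Raise early if any bound or coefficient is NaN/inf, instead of
    letting GLPK fail later with an obscure assertion."""
    for i, v in enumerate(values):
        if not math.isfinite(v):
            raise ValueError(f"{name}[{i}] is non-finite: {v!r}")

# Example: a stray NaN in what would become row bounds
row_bounds = [0.0, 12.5, float("nan"), 3.0]
try:
    assert_all_finite("row_bounds", row_bounds)
except ValueError as error:
    print(error)  # prints: row_bounds[2] is non-finite: nan
```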
@BaptisteFrancois did you resolve this issue?
@jetuk Sorry - I have been kept busy with traveling and other duties... I will focus on this in the coming weeks. I'll keep you informed about how things are going. Thanks.
Fixing #759 (PR in #762) would help with not crashing Pywr in this case.
@BaptisteFrancois Did you ever figure out what the issue was? I'm having the same problem now too.
UPDATE: So in my case it was as @jetuk mentioned above: I had a stray missing value in a CSV file, which resulted in a NaN. This should probably be included in any future error catching. Fortunately (for us!) googling this error lands you here pretty quickly.
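A quick standard-library scan of the input CSVs catches this kind of stray missing value before the solver does. A minimal sketch (the data and column names are made up for illustration):

```python
import csv
import io
import math

def find_missing(csv_text):
    """Return (row number, column name) for every empty or non-finite cell."""
    bad = []
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    for row_no, row in enumerate(reader, start=2):  # row 1 is the header
        for col, cell in zip(header, row):
            try:
                if not math.isfinite(float(cell)):
                    bad.append((row_no, col))
            except ValueError:  # empty or non-numeric cell becomes NaN later
                bad.append((row_no, col))
    return bad

data = "flow,demand\n1.2,3.0\n,2.5\n0.9,1.1\n"
print(find_missing(data))  # [(3, 'flow')]
```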
@jetuk @rheinheimer sorry for taking about a year to get back to you... I am still experiencing this issue. In my case it does not result from a missing value in an input file: if I rerun the exact same simulation, it may go through without issue.
@jetuk I am willing to try your suggestion of rebuilding by passing --enable-debug to setup.py. However, I am not quite sure how to do this.
I am basically running Pywr through a Python script, using first m.load(my_model) and then m.run().
Could you help me with the setup.py command to which I should pass the --enable-debug argument?
Thanks.
You'll need to clone the repository and run a command like this:
python setup.py develop --with-glpk --enable-debug
If you are using Anaconda on Windows you might need to do something like this:
set LIBRARY=%CONDA_PREFIX%\Library
set LIBRARY_INC=%LIBRARY%\include
set LIBRARY_LIB=%LIBRARY%\lib
python setup.py build_ext -I"%LIBRARY_INC%" -L"%LIBRARY_LIB%" --inplace --with-glpk --enable-debug develop
If you do this you should get assertion errors if non-finite values are being given to the GLPK update routines. You may also get some extra output if very small but non-zero values are being used.
@BaptisteFrancois I get this error on a regular basis (including just now, prompting me to reply), though it's completely unpredictable. I just re-run the model and cross my fingers; it usually works without problem. Since re-running is trivial, that's what I do for now, but I might eventually try to debug.
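Since a plain re-run usually succeeds, one stopgap is a retry wrapper around the run call, so a single failure does not abort a batch. This is a sketch only; run_once stands in for whatever function calls m.run(), and the exception type to catch would depend on how the solver failure surfaces:

```python
def run_with_retries(run_once, attempts=3):
    """Retry a flaky model run a few times before giving up."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return run_once()
        except RuntimeError as error:  # e.g. "NO PRIMAL SOLUTION EXISTS"
            last_error = error
            print(f"attempt {attempt} failed: {error}")
    raise last_error

# Demo with a stand-in that fails once, then succeeds
state = {"calls": 0}
def flaky_run():
    state["calls"] += 1
    if state["calls"] < 2:
        raise RuntimeError("no primal solution")
    return "ok"

print(run_with_retries(flaky_run))  # one failed attempt, then prints: ok
```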
@rheinheimer can you reproduce this at all?
Are both of you using an algorithm/system (e.g. an MOEA) that gives random(ish) inputs to a model and then re-runs it?
@jetuk @rheinheimer Apologies... I have not yet found time to seriously investigate this. On my side, I am not using an MOEA. It is also difficult to reproduce because, as described above, the crashes are almost random. Unlike for @rheinheimer, these random crashes are a real problem for me because I am running several simulations in parallel through MPI: when one simulation crashes, the failure cascades to the whole MPI process.
I think I have diagnosed another issue leading to a random crash. Specifically, I noticed that self.model.nodes['reservoir'].get_level(scenario_index) sometimes returns an infinite value, which makes the model crash. The crash often happens at the first time step. Note that the maximum values used within the 'level' and 'storage' attributes, required for the interpolation in the 'get_level' method, are significantly larger than the maximum storage.
I have not given up on installing the developer version of pywr to use the debug mode; I just have not found the time to do so yet.
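The get_level behaviour described above can be illustrated without pywr: piecewise-linear interpolation propagates any non-finite volume straight into the level. A minimal sketch with made-up storage/level tables (not pywr's actual implementation):

```python
import math

storages = [0.0, 50.0, 100.0]  # illustrative volume breakpoints
levels = [10.0, 20.0, 30.0]    # corresponding reservoir levels

def get_level(volume):
    """Piecewise-linear interpolation of level from volume."""
    if not math.isfinite(volume):
        return float("nan")  # a NaN/inf volume can only produce garbage
    volume = min(max(volume, storages[0]), storages[-1])  # clamp to the table
    for s0, s1, l0, l1 in zip(storages, storages[1:], levels, levels[1:]):
        if volume <= s1:
            return l0 + (l1 - l0) * (volume - s0) / (s1 - s0)

print(get_level(25.0))          # 15.0
print(get_level(float("nan")))  # nan -- worth asserting on before solving
```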
What parameter are you using for the level calculation? Does the issue go away if you use a ConstantParameter instead?
I am not using an MOEA or anything random. I cannot purposefully reproduce it, other than by running the model a few times in a row, and even then the issue may occur one or more times, or not at all.
My working theory is that the general issue is to do with floating-point precision problems when comparing what should be a fixed constraint. I think we might be creating a doubly bounded constraint with a very tiny range. I'll see about making a PR that we can use to test that theory.
I think the first time-step level issue might be unrelated, though, if there are NaNs involved.
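The tiny-range theory can be sketched as a bound-classification rule: if the gap between the lower and upper bound of a doubly bounded constraint falls below some threshold, treat it as a fixed constraint instead. This is only an illustration of the idea (with a made-up threshold), not the actual cython_glpk change:

```python
def classify_bounds(lower, upper, threshold=1e-9):
    """Collapse a doubly bounded constraint with a near-zero range into a
    fixed constraint, avoiding floating-point trouble with tiny ranges.
    (Illustrative sketch only, not pywr's cython_glpk code.)"""
    if abs(upper - lower) < threshold:
        midpoint = 0.5 * (lower + upper)
        return ("fixed", midpoint, midpoint)
    return ("double", lower, upper)

print(classify_bounds(5.0, 5.0 + 1e-12)[0])  # fixed
print(classify_bounds(0.0, 10.0))            # ('double', 0.0, 10.0)
```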
@BaptisteFrancois I have created a branch (glpk-fixed-con-threshold) with a potential fix as described above. See https://github.com/pywr/pywr/pull/925. Is there any chance you could try this out and see if it helps?
@jetuk sure I can do that.
I noticed that you included the threshold in pywr/solvers/cython_glpk.pyx. However, I do not have this file in my pywr folder; I only have cython_glpk.cp36-win_amd64.pyd.
Does that mean I am using a previous version of pywr? B.
No, it means it's part of the source code from which that .pyd file is built. To test this change you'll have to compile Pywr from source, unfortunately.
OK, got it. I'll try to get this running by tomorrow evening; if not, you should hear from me sometime next week.