
ValueError: cannot convert float to NaN

quaquel opened this issue 2 years ago · 0 comments

In recent weeks, various people have reported getting a ValueError: cannot convert float to NaN.

The error is triggered in lines 285-288 of callback.py. Note that until https://github.com/quaquel/EMAworkbench/commit/ba4abbbbaa68d56ea604a242bb721fcd65b39afb, this same code was used in both __init__ and _store_outcome. That commit was just a first step towards isolating the problem and investigating it further.

What is strange about this error is that the code in lines 285-288 of callback.py, where the error originates, has been in the workbench since 27 March 2019 and was part of the 2.0 release in April 2019. Until the last month or so, I had never seen this error nor received questions about it. My hunch is that something in NumPy related to the reworking of dtypes is now triggering this error.

So what is going on? The code itself is actually wrong. If you initialize a NumPy array with any dtype other than float, you will get a ValueError, because np.NaN is a float and cannot be assigned to a non-float NumPy array.
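
A minimal sketch of the failing pattern, stripped of the workbench context (the array name and shape here are made up for illustration):

```python
import numpy as np

results = np.empty(10, dtype=int)  # any non-float dtype
results[:] = np.nan  # raises ValueError: NaN is a float and has no integer representation
```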

What is the motivation for this code? The key idea is to set up NumPy arrays in which to store the results and pre-fill them with a value that is instantly recognizable as missing if a given experiment fails. A simple use case is VensimModel: if an experiment triggers a floating-point error in Vensim, this raises a CaseError, but the remaining experiments continue. There can thus be failed experiments, and the purpose of the NaNs is to make this obvious when you return to stored results after a while. However, since np.NaN is a float, it cannot serve as a general solution.
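
For float outcomes, the NaN flag works exactly as intended; a minimal sketch with hypothetical names:

```python
import numpy as np

n_experiments = 4
outcome = np.full(n_experiments, np.nan)  # pre-fill with the missing-value flag

# successful experiments overwrite the flag; failed ones leave it in place
outcome[0] = 1.5
outcome[2] = 3.2

failed = np.isnan(outcome)  # array([False,  True, False,  True])
```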

So what other solutions do we have for this problem?

  1. Have some custom flag for float, integer, and object dtypes that the workbench uses to mark missing values. The problem is that separate flags would be needed for every conceivable dtype. Moreover, what would be a sensible flag to pick?
  2. Use numpy.ma. The numpy.ma module provides masked arrays that behave like normal NumPy arrays but carry a mask for missing or invalid entries (see the sketch after this list). The drawback is that this potentially requires a bigger code overhaul, depending on the desired behavior. Basically, how should the workbench handle failed experiments?
  3. Shift to pandas for storing the outcomes and use pd.NA. However, pd.NA is still an experimental feature and seems to have been added to pandas primarily because NumPy lacks an integer equivalent of NaN.

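To make solution 2 more concrete, here is a minimal sketch of what masked-array storage could look like; the names and shapes are hypothetical, not the actual callback code:

```python
import numpy.ma as ma

n_experiments = 4

# works for any dtype, including integers, unlike the NaN approach
outcome = ma.masked_all(n_experiments, dtype=int)

# storing a result automatically unmasks that entry;
# failed experiments simply stay masked
outcome[0] = 7
outcome[2] = 42

print(outcome)                   # [7 -- 42 --]
print(ma.getmaskarray(outcome))  # [False  True False  True]
```
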
I am inclined toward solution 2, but I am not convinced yet. Also, there might be other possible solutions that I have not thought of yet.

In my view, the workbench should be able to execute experiments and continue even if individual experiments fail. In particular, when scaling the workbench to use HPC resources, this is desirable behavior unless many experiments fail.

However, once finished with the experiments, the workbench should make it easy to identify which experiments have failed. Masked arrays make this easy: take the mask from a given outcome, apply it to the experiments DataFrame, and you know which experiments failed.
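
A sketch of that lookup, using toy stand-ins for the (experiments, outcomes) pair returned by perform_experiments; for a multi-dimensional outcome, the mask is first reduced over the non-experiment axes:

```python
import numpy.ma as ma
import pandas as pd

# toy stand-ins for real workbench results
experiments = pd.DataFrame({"x": [0.1, 0.2, 0.3, 0.4]})
outcome = ma.masked_all(4, dtype=float)
outcome[[0, 2]] = [1.0, 2.0]  # experiments 1 and 3 failed

mask = ma.getmaskarray(outcome)
if mask.ndim > 1:
    # an experiment counts as failed if any of its entries are masked
    mask = mask.any(axis=tuple(range(1, mask.ndim)))

failed_experiments = experiments[mask]  # rows 1 and 3
```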

Ideally, the workbench would even log the failed experiments in some easy-to-read overview and do the same when loading results that contain masked elements in the outcomes. If no experiments fail, it is easy to unmask the outcomes, turning them into regular NumPy arrays.
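
Unmasking is indeed trivial, e.g.:

```python
import numpy.ma as ma

if not ma.is_masked(outcome):      # no masked entries, so no failures
    outcome = ma.getdata(outcome)  # back to a plain np.ndarray
```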

When using optimization, failed experiments should trigger a failure in the optimization.

quaquel · Jun 13 '22 18:06