Add Safeguards to EulerCore to reduce undefined behavior for SDE models
### Motivation / Current Behaviour
As of now, SDE models have to implement their own checks for whether flows/derivatives push compartments out of their valid range (negative compartments, or compartments greater than the population size). The current implementation can also produce undefined behavior through floating point operations (see issue https://github.com/SciCompMod/memilio/issues/1009).
### Enhancement description
By adapting the EulerCore (or potentially writing a dedicated EulerMaruyamaCore), one could use vectorized functions to add these safeguards and other modifications to the core. Potential modifications are:
- Clipping compartments to the interval [0, population_size]
- Setting compartments below some threshold to zero
- Rescaling compartments in case clipping changed the total population size
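A minimal sketch of how such vectorized safeguards could look, using plain `std::vector` for illustration (the function name `apply_safeguards` is hypothetical, not part of MEmilio):

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <numeric>
#include <vector>

// Illustrative safeguard step combining the three modifications above:
// clip, zero out tiny values, then rescale to preserve the population size.
void apply_safeguards(std::vector<double>& y, double population_size, double threshold)
{
    // 1) Clip each compartment to [0, population_size].
    for (double& v : y) {
        v = std::clamp(v, 0.0, population_size);
    }
    // 2) Set compartments below the threshold to zero.
    for (double& v : y) {
        if (v < threshold) {
            v = 0.0;
        }
    }
    // 3) Rescale so the total matches the population size again.
    double total = std::accumulate(y.begin(), y.end(), 0.0);
    if (total > 0.0) {
        for (double& v : y) {
            v *= population_size / total;
        }
    }
}
```

With Eigen (which MEmilio uses internally), steps 1 and 3 would map to coefficient-wise `.max()`/`.min()` and a scalar multiply on the state vector.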
### Additional context
No response
### Checklist
- [X] Attached labels, especially loc:: or model:: labels.
- [X] Linked to project
I have proposed a possible "new" EulerCore in the linked branch; before opening a pull request, I would like to discuss whether this should be moved into a new core.
@reneSchm any expert opinion how Nils could proceed?
I assume by "undefined behaviour" you mean NaNs in simulation results? While they are not great, these NaNs are perfectly well-defined behaviour in our case.
I am not convinced that the IntegratorCore is a suitable place for this. The main issues I see are:
- Placing these safeguards in the IntegratorCore mixes responsibilities and creates more interdependence. The Integrator is perfectly fine with negative values, and will not produce any unless they come from the model (or more accurately, the integrand aka. DerivFunction).
- I am not sure that it is desirable to rescale integration results. It seems to me like a band-aid fix to an issue with the Model/DerivFunction, that may cause issues when used with different models later.
- Does your EulerCore work with flow models? The flows (used as yt) are initialised as 0, so if I read this correctly, the integration result will be constant 0.
An alternative may be to use a new DerivFunction to compute the stochastic component from each flow and add it to that flow, as well as applying the safeguards. This DerivFunction can be used by a new Simulation class, or we can discuss adding it in some form to the FlowSimulation.
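To make the suggested alternative concrete, here is a hedged sketch of what wrapping a deterministic flow function could look like. The `FlowFunction` signature and `make_stochastic` are assumptions for illustration; MEmilio's actual `DerivFunction` type differs:

```cpp
#include <cmath>
#include <functional>
#include <memory>
#include <random>
#include <vector>

// Hypothetical signature loosely following the idea of a DerivFunction on
// flows: given the current state y and time t, write the flows into `flows`.
using FlowFunction =
    std::function<void(const std::vector<double>& y, double t, std::vector<double>& flows)>;

// Wrap a deterministic flow function: add a sqrt(flow)-scaled noise term to
// each flow (Euler-Maruyama style) and clamp so flows stay non-negative.
FlowFunction make_stochastic(FlowFunction deterministic, double dt, unsigned seed)
{
    auto rng = std::make_shared<std::mt19937>(seed);
    return [deterministic, dt, rng](const std::vector<double>& y, double t,
                                    std::vector<double>& flows) {
        deterministic(y, t, flows); // compute the deterministic component
        std::normal_distribution<double> noise(0.0, 1.0);
        for (double& f : flows) {
            // Diffusion term scaled by sqrt(f / dt); safeguard: keep f >= 0.
            f = std::max(0.0, f + std::sqrt(std::max(f, 0.0) / dt) * noise(*rng));
        }
    };
}
```

A new Simulation class could then pass the wrapped function to the integrator unchanged, keeping the safeguards out of the IntegratorCore itself.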
I have an experimental branch here. It does not have your safeguards yet, but I think they can be added around the calculation of m_pop or around the computation of the stochastic components. Can you maybe try to use this with a model that is currently causing issues, so we can try and discuss fixes? That is, can you implement the "get_flow_denoised" function, which just returns the flows without clamping or adding a stochastic component?
Hi René, thank you for your insight. I will implement a model with 3 variants and 4 different immunity states as an example.
I have implemented two models, sde_ui3 and sde_ui10, reflecting an SIS model with virus mutation (the first with 3 variants and 4 immunity states, the second with 10 variants and 11 immunity states). It seems to work quite well, but I have yet to write tests.
One thing I have noticed is that the 10-variant model is very demanding to build with CMake. My setup barely finished compilation with 32 GB RAM and a 100 GB swap file, so I excluded the model from the CMake files in the branch.
The ui10 model is demanding because of its ~400 infection states: the compile-time map for indices is optimized for runtime, but not for compile time. Looking at get_flows in the ui10 model, you could make the virus variant a population category, similar to AgeGroups in the other ODE models (e.g. the SEIR or SECIR models), which use a ContactMatrix for interaction parameters between AgeGroups. I think the immunity states could be a category as well; at a glance, their calculation only differs by the parameters used. If you would like some help with that, you can message me on Mattermost or by mail.
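To illustrate the category idea (names hypothetical; MEmilio's actual Index machinery is template-based), treating variants and immunity states as categories turns the ~400 states into a flat index computation instead of individually enumerated compartments:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical flat-index helper: infection states indexed by
// (immunity state, virus variant, base state), analogous to how AgeGroup is
// used as a population category in the ODE models. Row-major layout.
std::size_t flat_index(std::size_t immunity, std::size_t variant, std::size_t base_state,
                       std::size_t num_variants, std::size_t num_base_states)
{
    return (immunity * num_variants + variant) * num_base_states + base_state;
}
```

With such an indexing scheme, get_flows can loop over categories at runtime instead of instantiating a template per state combination, which is what drives the compile-time cost.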
I played around a little with the models, and without a factor of 1 - 1e-4 or similar on the upper bound for the flows, they ~~play the batman theme~~ produce NaNs really quickly, within around five time steps. While we do need to guard against this, is there maybe another issue with the calculation or the parameter choice? Even with this guard/factor, the models seem to quickly converge to a steady state, since the flows are almost constant for most of the time.
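For reference, the guard described above could look like the following sketch (the function name `clamp_flow` is illustrative): clamping a flow to a fraction of the maximum feasible outflow keeps the source compartment strictly positive, which avoids NaNs from, e.g., taking the square root of a slightly negative value in the next step.

```cpp
#include <algorithm>
#include <cassert>

// Guard a flow against fully draining its source compartment in one step:
// clamp it to `factor * source / dt`, with factor slightly below 1 (the
// 1 - 1e-4 mentioned above) so the source stays strictly positive.
double clamp_flow(double flow, double source, double dt, double factor = 1.0 - 1e-4)
{
    double upper = factor * source / dt;
    return std::clamp(flow, 0.0, upper);
}
```

Whether this belongs in the model's get_flows or in the wrapping DerivFunction is exactly the design question discussed above.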