pysteps icon indicating copy to clipboard operation
pysteps copied to clipboard

Add Kalman-filter based post-processing to blending scheme

Open RubenImhoff opened this issue 1 year ago • 12 comments

As discussed at the start of our blending implementation and re-discussed during this year's EGU, we can still improve the post-processing that takes place in the blending procedure. Right now, the post-processing procedures from the nowcasting module are used and adjusted to work within the blending module. This is functional, but has as limitation that the post-processing is based on (1) having everything in Lagrangian coordinates and that is not the case in the blending procedure (we have incorporated that difference, but is suboptimal), and (2) it depends on the weights assigned to the NWP and extrapolation component, which could reduce extremes.

A Kalman-filter like method, such as in Nerini et al. (2019; https://doi.org/10.1175/MWR-D-18-0258.1), could do the trick. I think we should see if we can implement this in the blending module anytime soon. :)

RubenImhoff avatar Apr 23 '24 13:04 RubenImhoff

Thanks for initiating this @RubenImhoff ! Can you provide any hint on where we should make changes in the current code? I can then try to make a first draft.

dnerini avatar Apr 26 '24 11:04 dnerini

Sorry for completely forgetting to respond to this, @dnerini! I will dive into the code and come back to your question early next week. Would be great to put this feature into the code. :)

RubenImhoff avatar Jul 12 '24 12:07 RubenImhoff

Hi @dnerini, To finally come back to your question:

This is where the probability matching is called in the code: https://github.com/pySTEPS/pysteps/blob/953f799f932760dd5775ce4f9f0fc771e4d0a67d/pysteps/blending/steps.py#L1494.

The CDF probmatching method still calls this original function: https://github.com/pySTEPS/pysteps/blob/953f799f932760dd5775ce4f9f0fc771e4d0a67d/pysteps/postprocessing/probmatching.py#L54

And maybe most important, as the fields are not in Lagrangian coordinates (whereas this is the case in the nowcasting module until the probability matching part), we perform the post-processing on a blended extrapolation-NWP field (that is, excluding the noise cascade) to get some sort of indication of where the rainfall fields would be at that timestep and what they would look like given the original input files and no added noise: https://github.com/pySTEPS/pysteps/blob/953f799f932760dd5775ce4f9f0fc771e4d0a67d/pysteps/blending/steps.py#L1431.

Is this something you can work with? Let me know how I can assist, happy to help!

RubenImhoff avatar Jul 17 '24 07:07 RubenImhoff

Another potential post-processing issue we have is that the incremental mask is not really incrementing over time, see https://github.com/pySTEPS/pysteps/blob/953f799f932760dd5775ce4f9f0fc771e4d0a67d/pysteps/blending/steps.py#L1471. This is also caused by the lack of Lagrangian coordinates, which makes the implementation different from the nowcasting approach. Anyway, any thoughts here are welcome too. :)

RubenImhoff avatar Jul 17 '24 07:07 RubenImhoff

There are 2 separate issues here: 1) create a new target distribution for the resampling of the precipitation (instead of probability matching with a blended nowcast, sample from both NWP & nowcast according to the current skill to weight the sampling probability) 2) Kalman filter (separate issue)

ladc avatar Jul 18 '24 09:07 ladc

simple code to resample the distributions:

    def resample_distributions(a, b, weight):
        '''merge two distributions'''

        assert a.size == b.size
        asort = np.sort(a.flatten())[::-1]
        bsort = np.sort(b.flatten())[::-1]
        n = asort.shape[0]

        # resample
        idxsamples = np.random.binomial(1, weight, n).astype(bool)
        csort = bsort.copy()
        csort[idxsamples] = asort[idxsamples]
        csort = np.sort(csort)[::-1]
        return csort

@RubenImhoff maybe you can help finding where this could fit into the pysteps blending code, and also which value to use for weight

dnerini avatar Jul 18 '24 09:07 dnerini

This is the idea behind the "Empirical distribution matching" (section 3.3) in this paper. So in principle this part of the code can be included in the postprocessing folder? So then it can be an option as cdf, etc.

aitaten avatar Jul 18 '24 10:07 aitaten

@dnerini, yes, it concerns these weights: https://github.com/pySTEPS/pysteps/blob/162985942dbd45be4d88c0f84a946e0501d71f3c/pysteps/blending/steps.py#L1405.

RubenImhoff avatar Jul 18 '24 10:07 RubenImhoff

I agree with @aitaten that this would fit well in the post-processing and that we then call the method just like we call "cdf" now as prob_matching method.

RubenImhoff avatar Jul 18 '24 10:07 RubenImhoff

I agree with @aitaten that this would fit well in the post-processing and that we then call the method just like we call "cdf" now as prob_matching method.

Not sure about this. I see the resampling as an additional step you would do before the prob matching , which you still need, so to construct a new target distribution

dnerini avatar Jul 18 '24 10:07 dnerini

Ah that way, I see what you mean. In any case, it will only be used by the probability matching in the end, right? If so, it would make sense to place that function there too.

RubenImhoff avatar Jul 18 '24 11:07 RubenImhoff

@dnerini, I have opened a draft pull request that we can work in.

RubenImhoff avatar Jul 18 '24 14:07 RubenImhoff