Parametric pysteps and a database backend

Open alanseed opened this issue 6 months ago • 9 comments

I have completed a prototype of a parametric version of pysteps. It uses the mean and standard deviation of raining pixels (dBr), the raining fraction, a piecewise linear power spectrum, and a dynamic scaling model based on the correlation time at a reference scale to generate the stochastic fields that are needed for the nowcasts. Blending with NWP is done by blending the parameters rather than the cascade fields. Radar-only nowcasting is done by keeping the parameters fixed through the forecast lead time. The model is suitable for limited-area domains, generally less than 500 km, where the NWP skill in predicting the spatial and temporal distribution of sub-hourly rain rates inside the domain is negligible.

I am using MongoDB as a backend to manage the data, parameters, configuration, and output product generation. This has simplified configuration management and allowed me to develop a modular approach to generating the nowcasts. For example, loading the input radar and NWP fields, calculating the parameters, generating the ensembles, and assembling the output products for the end users are all separate applications that interact with the database. All rainfall fields are indexed by domain id, product id, valid time, base time, and ensemble number, and are stored as CF-compliant netCDF files in the database. The latter two indices are set to None for non-ensemble products.
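The indexing scheme described above could be sketched roughly as follows. This is only an illustration, not the schema used in the prototype: the class name `FieldKey`, its field names, and the example domain and product ids are all hypothetical, and no database call is made.

```python
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class FieldKey:
    """Hypothetical index for one stored rainfall field."""

    domain_id: str
    product_id: str
    valid_time: datetime
    # The last two indices stay None for non-ensemble products.
    base_time: Optional[datetime] = None
    ensemble: Optional[int] = None

    def as_query(self) -> dict:
        """Return the key as a plain dict, e.g. for use as a query filter."""
        return asdict(self)


# Example ids ("akl", "qpe", "nowcast") are made up for illustration.
t0 = datetime(2025, 5, 29, 23, 0, tzinfo=timezone.utc)
t1 = datetime(2025, 5, 29, 23, 10, tzinfo=timezone.utc)

obs_key = FieldKey("akl", "qpe", valid_time=t0)
member_key = FieldKey("akl", "nowcast", valid_time=t1, base_time=t0, ensemble=3)
```

Keeping the key immutable means it can also be used as a dictionary key when fields are cached in memory.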

My science questions are related to methods to predict NWP skill at forecasting these STEPS parameters at lead times greater than 3 hours and methods to optimally select NWP ensemble members so as to increase the resolution of the output STEPS ensembles. There is also the interesting question of how to generate stochastic time series of the 6 STEPS parameters that are conditioned on the observations and forecasts.

Happy to share the code and will need some help in setting it up as a pysteps branch before the workshop.

Alan

alanseed avatar May 29 '25 23:05 alanseed

Hi @alanseed, we have already been in contact about your work, of course, but very interesting contribution! I expect that we'd all be happy to assist in including it in the pysteps code and learn how it improves and/or provides an alternative to the current (blending) approaches. :)

RubenImhoff avatar May 30 '25 15:05 RubenImhoff

Hi Ruben,

Thanks for the encouraging reply. I have created the pysteps_param branch in my pysteps repo and would like to push it to you. How do I do this? I have put the code into two new sub-directories, mongo and param.

It is very rough at the moment: just a port of the work that I have been doing, and a mixture of executables and modules, I am afraid.

Regards Alan

alanseed avatar May 31 '25 00:05 alanseed

Actually Ruben,

Here is the code as a zip file. It is in the Weather Radar New Zealand git repo, not in my personal repo, and I need to work out how to set it up so that we can collaborate on it in a structured way.

Regards Alan

alanseed avatar May 31 '25 00:05 alanseed

@RubenImhoff @dnerini I have forked pysteps and pushed the pysteps_param branch to the repo. The code is in two new directories: mongo and param. It is still quite rough but you can get the general idea from what is there now.

alanseed avatar Jun 03 '25 10:06 alanseed

Hi @alanseed, thanks again for sharing this. It's an impressive piece of work so far!

For including the code in pysteps, I can imagine we would want to filter out those nowcasting and blending approaches that are different from the already existing approaches and include that in the nowcasting and blending folders in some way. Besides that, we can then keep the mongo setup and also the local parameterization/setup separate and, particularly for the mongo connection, see if we can include that in the code or keep it in a separate branch of pysteps or so. @dnerini, what are your thoughts?

RubenImhoff avatar Jun 03 '25 16:06 RubenImhoff

Hi @alanseed, so great to see you here :) Thanks a lot for sharing your work here, and apologies for the slow response from my side.

With respect to the discussion above, I wonder if the mongo setup goes a bit beyond the scope of pysteps, which has more of a scientific take in the sense that it is mostly a collection of methods found in the literature rather than a fully-fledged implementation of an operational system.

Instead, I wonder if it would be possible to implement only the more scientific developments (such as those under the param module) in the pysteps library, while the mongo logic could be hosted in a separate project using pysteps as a dependency.

Your thoughts? Would this make sense to you?

ps: I invited you to join the pysteps organization, so that you can directly create branches in the original pysteps repo without having to fork the project in your personal space.

dnerini avatar Jun 05 '25 19:06 dnerini

Hi @dnerini, nice to be able to work with you again. I have been missing my collaborations with the pysteps community. I was thinking similar thoughts, since I remember previous discussions at the start of the project regarding the scope of pysteps. The mongo code assumes a particular design for the database, and this is only guaranteed if the helper applications are used to set up the database. Setting up a separate project that depends on pysteps is a good idea, I think. There are some functions for netCDF that may be useful in the IO directory, mainly the data compression and CF metadata aspects.

alanseed avatar Jun 06 '25 22:06 alanseed

And the parametric approach you developed for STEPS could in principle be run without the mongoDB logic?

dnerini avatar Jun 07 '25 10:06 dnerini

The database was used to simplify blending, but a simple nowcast based on a parameter vector that is kept fixed over the forecast lead times does not need a database. The key function is shared_utils.update_field, which returns a field that is conditioned on a list of input cascades and advection fields. The rest is just housekeeping.

I have used a dictionary to hold the cascade, optical flow, and metadata:

    # Store independent copies so later changes do not alter the saved state.
    states[key] = {
        "cascade": copy.deepcopy(cascade_dict) if cascade_dict is not None else None,
        "optical_flow": oflow.copy() if oflow is not None else None,
        "metadata": copy.deepcopy(metadata_dict),
    }

We could make a class that holds the states and parameters and generates the nowcasts, perhaps returning the ensemble as an xarray object with coordinates and so on?
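One way such a class could start is sketched below. It only wraps the state dictionary described in this comment; the names `NowcastState` and `StateStore` are hypothetical, the nowcast generation and the xarray output are deliberately left out, and `copy.deepcopy` is used for the optical flow too so that the sketch does not depend on NumPy.

```python
import copy
from dataclasses import dataclass, field
from typing import Any, Dict, Hashable, Optional


@dataclass
class NowcastState:
    """Hypothetical container for one stored state."""

    cascade: Optional[dict] = None
    optical_flow: Optional[Any] = None
    metadata: dict = field(default_factory=dict)


class StateStore:
    """Hypothetical holder for states, keyed for example by valid time."""

    def __init__(self) -> None:
        self._states: Dict[Hashable, NowcastState] = {}

    def put(self, key: Hashable, cascade, optical_flow, metadata) -> None:
        # Deep copies keep the stored state independent of the caller's objects.
        # copy.deepcopy(None) returns None, so missing inputs need no special case.
        self._states[key] = NowcastState(
            cascade=copy.deepcopy(cascade),
            optical_flow=copy.deepcopy(optical_flow),
            metadata=copy.deepcopy(metadata),
        )

    def get(self, key: Hashable) -> NowcastState:
        return self._states[key]


store = StateStore()
meta = {"transform": "dB"}
store.put("t0", {"levels": [1.0, 2.0]}, [[0.0, 0.0]], meta)
meta["transform"] = "changed"  # does not affect the stored copy
store.put("t1", None, None, {})
```

A method that generates the ensemble could then be added to the same class once the interface to the update function is settled.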

alanseed avatar Jun 07 '25 21:06 alanseed

I have added the transformer data class to manage data transformations through an object that retains its own metadata, rather than having to keep track of the transformation metadata manually. You can find it in my pysteps_param branch as utils/transformer.py. I have decided to use the dbr transformation with the rain/no rain threshold = 1.0 mm/h and the zerovalue for the dbr field set at -0.1. This reduced the added variance at the lowest level in the cascade that was due to a large difference between the threshold in dbr and the zerovalue.
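To make the numbers concrete, here is a minimal sketch of such a transformation, assuming that "dbr" means 10·log10(R) with R in mm/h. The function names `to_dbr` and `from_dbr` are hypothetical and are not the API of utils/transformer.py. With a 1.0 mm/h threshold, the threshold in dbr is 0.0, so a zerovalue of -0.1 sits just below it.

```python
import math


def to_dbr(rain_rate: float, threshold: float = 1.0, zerovalue: float = -0.1) -> float:
    """Transform a rain rate in mm/h to dbr (assumed to be 10*log10(R))."""
    if rain_rate < threshold:
        # Dry pixels get the zerovalue, which lies just below the threshold in dbr.
        return zerovalue
    return 10.0 * math.log10(rain_rate)


def from_dbr(dbr: float, threshold: float = 1.0) -> float:
    """Invert the transformation; values below the threshold become 0 mm/h."""
    rate = 10.0 ** (dbr / 10.0)
    return rate if rate >= threshold else 0.0


threshold_dbr = to_dbr(1.0)        # threshold expressed in dbr
gap = threshold_dbr - to_dbr(0.2)  # distance between threshold and zerovalue
```

The gap between the threshold and the zerovalue is the quantity the comment refers to: the smaller it is, the smaller the step between dry and just-wet pixels in the transformed field.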

alanseed avatar Aug 06 '25 00:08 alanseed

Also, I have published my rainfields_db repo that manages the setup and IO for the MongoDB. This includes utilities for reading and writing CF netCDF files as 16-bit integers and a file naming manager.
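For readers unfamiliar with 16-bit storage, the sketch below shows the usual CF-style packing, where the stored integer relates to the physical value through `unpacked = packed * scale_factor + add_offset`. It is not taken from rainfields_db: the function names, the scale factor of 0.01, and the fill value of -32768 are assumptions chosen for illustration.

```python
from typing import List, Optional


def pack_int16(
    values: List[Optional[float]],
    scale_factor: float = 0.01,
    add_offset: float = 0.0,
    fill_value: int = -32768,
) -> List[int]:
    """Pack floats into the int16 range; None or NaN becomes the fill value."""
    packed = []
    for value in values:
        if value is None or value != value:  # value != value is True only for NaN
            packed.append(fill_value)
            continue
        quantised = round((value - add_offset) / scale_factor)
        # Clip to the valid range, keeping -32768 reserved for missing data.
        packed.append(max(-32767, min(32767, quantised)))
    return packed


def unpack_int16(
    packed: List[int],
    scale_factor: float = 0.01,
    add_offset: float = 0.0,
    fill_value: int = -32768,
) -> List[Optional[float]]:
    """Recover floats from packed integers; the fill value becomes None."""
    return [
        None if p == fill_value else p * scale_factor + add_offset
        for p in packed
    ]


packed = pack_int16([0.0, 12.34, None, 1000.0])
restored = unpack_int16(packed)
```

With these assumed settings the resolution is 0.01 units and values above 327.67 are clipped, so the scale factor has to be chosen with the expected range of the field in mind.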

alanseed avatar Aug 06 '25 00:08 alanseed