Start with low-hanging fruit:

clearer variable names (compare to nowcasts/steps.py refactoring)
potentially make the monolithic function more modular.

Jul 19 '24 09:07 ladc

Yes please!

Jul 19 '24 10:07 dnerini

@RubenImhoff I went trough the whole code to see what was unclear for me. I will already paste my findings here and maybe we can discuss later today. Ill first time some time to eat :p

Specific lines of needed name changes

Line 583: R_f should get a new name

Line 600: R_f_ empty_precip?

Line 616: outarr

Line 677: pp, generate_noise, noise_std_coeffs = _init_noise(

Line 716: randgen_prec, vps, generate_vel_noise = _init_random_generators(

Line 727: D, D_Yn, D_pb, R_f, R_m, mask_rim, struct, fft_objs = _prepare_forecast_loop(

Line 784: forecast_prev = precip_cascade noise_prev = noise_cascade t_prev = [0.0 for j in range(n_ens_members)] t_total = [0.0 for j in range(n_ens_members)]

Line 790: This timestep function is extremly complicated to understand, maybe changing some names would make this more clear

Line 858 rho_extr would change to rho_extrap (just a bit more clear what this is)

Line 865 R_f_ should be renamed: R_f_temp

Line 922: cov -> covariance_radar_model?

Line 955: EPS should be changed, quite unclear what it is

Line 997: EPS_ should be changed

Line 1028: R_f_ip

Line 1033: Yn_ip

Line 1051: t_diff_prev

Line 1065: V_stack

Line 1073: V_model_

Line 1094: R_f_im_recomp

Line 1111: R_f_ep_recomp_

Line 1118: R_f_ep_recomp

Line 1119: temp_mask

Line 1122: R_f_ep

Line 1144: Yn_ip_recomp

Line 1158: Yn_ep_recomp

Line 1159: Yn_ep

Line 1173: R_f_ep_out, Yn_ep_out, R_pm_ep

Line 1285: R_f_out

Line 1339: cascades_stacked_

Line 1356: R_f_blended

Line 1361: R_f_blended_mod_only

Line 1384: R_f_new

Line 1391: R_f_new_mod_only

Line 1412: weights_pm, weights_pm_normalized, weights_pm_mod_only,

Line 1438 Better commenting?

Line 1439: R_pm_blended

Line 1447: R_pm_blended_mod_only

Line 1490: Can be removed

line 1517: R_cmin

Line 1525: MASK_prec -> MASK_precip or MASK_precip_cell?

Line 1550: mu_0, MASK, mu_fct

Line 1578: R_f_stacked

General: All fc should be changed to forecast?

We should look at all these t_... variables, they are quite strange

Do we go for mu and sigma or means and sigmas? Should be uniform choice

R_f = precip_forecast? is this correct?

Jul 19 '24 11:07 sidekock

For all of these, I could not find the names in the nowcast steps but it is really hard to see the variables through the forest :) . I also tried to do this matching by using some generative AI tools but it does not preform well on this sort of tasks

Jul 19 '24 11:07 sidekock

To give a brief answer (already):

General: All fc should be changed to forecast? --> Yes!

We should look at all these t_... variables, they are quite strange --> Fair, if doable :)

Do we go for mu and sigma or means and sigmas? Should be uniform choice --> I would go for means and sigmas then :)

R_f = precip_forecast? is this correct? --> Correct

Jul 19 '24 11:07 RubenImhoff

short comment: when there are so many variables given and returned by functions, it's usually a good indication that you need a class.

so why not refactoring the code into a class and so that the methods can simply access attributes instead of having to pass them around all the time? the main interface can stay the same, that is, forecast()

Jul 19 '24 11:07 dnerini

I agree this is the most elegant solution but maybe we should tackle that problem after name changes and changing the monolith into something more readable?

Jul 19 '24 12:07 sidekock

My work today so the refactoring can be done in the future; things like splitting up the code into more modular functions or making a class can be added after the current steps I list below:

STEP 1: All "R_f" should be changed to "precip_forecast". Use CTRL+F to find all relevant changes in both code and comments STEP 2: Specific names need to be changed; due to step 1, some of them have been changed from the original. Below I list all changed needed and some proposed names (can definitely be changed). The check with CTRL F is added because refactoring in one place does not always guarantee the name is changed in every function.

Line number	Old name	Name after step 1	Proposed name	Check with Ctrl F
600	R_f_	precip_forecast_	precip_forecast	No
616	outarr		precip_forecast_all_members_all_times	No
677	pp		generate_perturb	Yes
716	randgen_prec		randgen_precip	Yes
716	vps		velocity_perturbations	Yes
727, 2321	R_m		precip_forecast_non_perturbed
727, 2303	D		previous_displacement	No
727, 2304	D_Yn		previous_displacement_noise_cascade	No
727, 2305	D_pb		previous_displacement_prob_matching	No
761, 859	rho_extr_prev		rho_extrap_cascade_prev	No
762, 882, 892	rho_extr		rho_extrap_cascade	No
784	forecast_prev		precip_forc_prev_subtimestep	No
785	noise_prev		noise_prev_subtimestep	No
786	t_prev		t_prev_timestep	No
787	t_total		t_leadtime_since_start_forecast	No
922	cov		covariance_nwp_models	No
955, 2387	EPS		epsilon_decomposed	No
955, 964, 2387, 2391	epsilon_decomposed		epsilon_	No
977, 2403	EPS_		epsilon_temp	No
1026	t_diff_prev_int		t_diff_prev_subtimestep_int	No
1028	R_f_ip	precip_forecast_ip	precip_forecast_cascade_subtimestep	No
1033	Yn_ip		noise_cascade_subtimestep	No
1051	t_diff_prev		t_diff_prev_subtimestep	No
1059	velocity_pert		velocity_perturbations_extrapolation	No
1065	V_stack		velocity_stack_all	No
1073	V_model_		velocity_models	No
1094	R_f_ip_recomp	precip_forecast_ip_recomp	precip_forecast_recomp_subtimestep	No
1111	R_f_ep_recomp_	precip_forecast_ep_recomp_	precip_forecast_extrapolated_recomp_subtimestep_temp	No
1118	R_f_ep_recomp	precip_forecast_ep_recomp	precip_forecast_extrapolated_recomp_subtimestep	No
1119	temp_mask			???
1122	R_f_ep	precip_forecast_ep	precip_forecast_extrapolated_decomp	No
1144	Yn_ip_recomp		noise_cascade_recomp	No
1151	Yn_ep_recomp_		noise_extrapolated_recomp_temp	No
1158	Yn_ep_recomp		noise_extrapolated_recomp	No
1159	Yn_ep		noise_extrapolated_decomp	No
1173	R_f_ep_out	precip_forecast_ep_out	precip_forecast_extrapolated_decomp_done	No
1147	Yn_ep_out		noise_extrapolated_decomp_done	No
1192	R_		precip_forecast	No
1194	R_pm_ep		precip_forecast_extrapolated_probability_matching_temp	No
1201	R_pm_ep_		precip_forecast_extrapolated_probability_matching	No
1285	R_f_out	precip_forecast_out	final_blended_forecast	No
1294	cascades_stacked		cascade_stack_all_components	No
1339	cascades_stacked_		cascade_stack_all_components_temp	No
1347	covariance_nwp_models		covariance_all_components	No
1356	R_f_blended	precip_forecast_blended	precip_forecast_blended	No
1361	R_f_blended_mod_only	precip_forecast_blended_mod_only	precip_forecast_blended_mod_only	No
1384	R_f_new	precip_forecast_new	precip_forecast_recomposed	No
1391	R_f_new_mod_only	precip_forecast_new_mod_only	precip_forecast_recomposed_mod_only	No
1412	weights_pm		weights_probability_matching	No
1413	weights_pm_normalized		weights_probability_matching_normalized	No
1415	weights_pm_mod_only		weights_probability_matching_mod_only	No
1438	weights_pm_normalized_mod_only		weights_probability_matching_normalized_mod_only	No
1439	R_pm_blended		precip_forecast_probability_matching_blended	No
1447	R_pm_blended_mod_only		precip_forecast_probability_matching_blended_mod_only	No
1517	R_cmin		precip_forecast_min_value	No
1525	MASK_prec		precip_field_mask	No
1550	Mu_0		mean_probabiltity_matching_forecast	No
1551	MASK		no_rain_mask	No
1552	mu_fct		mean_precip_forecast	No
1578	R_f_stacked	precip_forecast_stacked	precip_forecast_final	No
1955	R_min		precip_forecast_min	No
1998	R_d		precip_forecast_decomp	No
2000	R_		precip_forecast	No
2014	R_c		precip_forecast_cascades	No
2098	R_c		precip_forecast_cascades	No
2165	R_models_pm		precip_forecast_probability_matching	No
2291	R_c		precip_forecast_cascades	No
2321	R_m		????	???
2333	R_c		precip_forecast_cascades	No

STEP 3: Splitting up the code into separate functions (I might add some suggestions here tomorrow)

STEP 4: Changing to a class will make keeping track of all these variables less necessary.

Please provide me with some feedback.

Jul 19 '24 15:07 sidekock

I just went trough the code to try and refactor it. A version This is just a proof of concept is now pushed to try and show my suggestions (however with the old names as I did not push my version with name changes yesterday...) Below I will post the suggestions I applied to the code to have a clear visual what to do once the master branch can be refactored.

Jul 20 '24 14:07 sidekock

Refactoring Plan

Part in Code	Suggested Name	Extra Info/Recommendations
Everything from start till just before #1	`preprocessing`	Check if all variables are still found in function: `zerovalue`
Everything from #2 to just before #2.3	`compute_cascades`	Check if all variables are still found for the function: `mu_extrapolation`, `sigma_extrapolation`
After 2.3.1 in `if zero_precip_radar and zero_model_fields:`	`no_rain_radar_and_nwp`
Bring #3	`initialisze_noise_methods`
Bring #6 into one function except for last line	`initiate_all_random_generators`	Check if all variables are still found: `randgen_prec`, `vps`, `generate_vel_noise`, `mu_noise`, `sigma_noise`, `D`, `D_Yn`, `D_pb`, `R_f`, `R_m`, `mask_rim`, `struct`, `fft_objs`
Everything in worker should be first extracted to a worker function	-	Not a nested definition but a separate function
Then worker should be split up further	-
Bring 8.1.2	`determine_skill_nwp_at_leadtime`
Bring 8.2	`calculate_weights_per_component`
Bring 8.3.1	`calculcate_epsilon_decomposed`
8.3.2	`iterate_ar_extrapolation_per_cascade`
8.3.3	`iterate_ar_noise_per_cascade`
For loop in 8.4	`extrapolate_per_timestep`
Following for loop just before 8.5	`advect_forecast_one_timestep_no_subtimesteps`
In 8.5 if `blend_nwp_members:` else ...	`concatinate_cascade_mean_and_sigmas`
if `weights_method == "spn"`	`weights_method_spn`
Before 8.6	`blend_all_cadcades`
8.6	`recompose_cascades_to_precip_field`
8.7.1	`calculate_probability_matching_member`
8.7.1	`calculate_forecast_probabilistic_member`
8.7.2	`apply_probability_matching`

Jul 20 '24 14:07 sidekock

@RubenImhoff @dnerini would now be a good time to refactor the code in steps blending and nowcasting? I have talked to @mats-knmi about how this would be integrated in the x-array version. This would be done after the changes are done in the current versions and once all tests are passed.

Sep 18 '24 09:09 sidekock

sounds very good!

Sep 18 '24 11:09 dnerini

@dnerini just to confirm if this would be okay. I was planning to refactor this code to a class due to all the variables. Does this conform to the pysteps design philosophy or should I just keep all the variables as local variables and remove the class based approach?

Sep 20 '24 12:09 sidekock

@sidekock Go ahead with the refactoring. Yes, true, at the beginning we explicitly decided to avoid oop, but i believe the limits and shortcomings of that decision are quite clear.. so let's give it a go!

Sep 20 '24 12:09 dnerini

@sidekock, sorry I was on vacation, so therefore my slow response. Great if you want to make a start with this! :)

Sep 21 '24 12:09 RubenImhoff

@sidekock, since I am planning on starting with the blending conversion to xarray as well soon, I would appreciate it if you could create a draft pull request from the branch you are currently working on and add me as (one of) the reviewers. This way I can easily keep track of your changes and take this into account when working on the xarray stuff.

Sep 23 '24 13:09 mats-knmi

@mats-knmi Will do! I will probably only start somewhere later this week. The teaching season is once again upon me so first need to make sure I am ready on that front :)

Sep 23 '24 14:09 sidekock

@mats-knmi I will start this afternoon with the re-factoring. Will start with the nowcast steps to have a template and then look at the blending steps. Ill add you as a reviewer as soon as the nowcast steps is refactored @RubenImhoff can I add you too?

Oct 08 '24 07:10 sidekock

So I started today but I have a general question regarding the _check_inputs function. This is used to check some of the inputs put others are then checked in the forecast loop itself. Would it not be best to add all these checks and preparatory assignments (eg None -> dict()) in this _check_inputs?

Oct 08 '24 14:10 sidekock

@mats-knmi I will start this afternoon with the re-factoring. Will start with the nowcast steps to have a template and then look at the blending steps. Ill add you as a reviewer as soon as the nowcast steps is refactored @RubenImhoff can I add you too?

Of course! :)

Oct 08 '24 15:10 RubenImhoff

So I started today but I have a general question regarding the _check_inputs function. This is used to check some of the inputs put others are then checked in the forecast loop itself. Would it not be best to add all these checks and preparatory assignments (eg None -> dict()) in this _check_inputs?

That may be a better way indeed.

Oct 08 '24 15:10 RubenImhoff

pysteps
pysteps copied to clipboard

Refactor the blending/steps.py nowcasting code

Refactoring Plan

pysteps pysteps copied to clipboard

Refactor the blending/steps.py nowcasting code

Refactoring Plan

pysteps
pysteps copied to clipboard