climate_indices icon indicating copy to clipboard operation
climate_indices copied to clipboard

computation time of self calibration (scpdsi)

Open lukruh opened this issue 4 years ago • 8 comments

hey, I just found and tried this library. Unfortunately I was not able to use the indeces.scpdsi() method, it seems it takes forever. I also tried your example notebook (canadian stations) and did some logging, it stays in the self calibration progress for several hours without a result. Is it supposed to take so long or am I doing something wrong?

  • I used the latest version from pip
  • Installed everything through pip and ran the code in python (no jupyter no conda)
  • the pdsi (without self calibration) finishes in seconds
  • disabled the numba decorator as in #370

lukruh avatar Apr 06 '20 09:04 lukruh

Thanks @lukruh

You may have discovered a bug in the self-calibration section for scPDSI. The PDSI and scPDSI codes are still somewhat experimental. My experience has been similar to yours in that it takes forever to compute scPDSI on a dataset of any size.

I apologize for not having more of an explanation/solution for this yet.

monocongo avatar Apr 06 '20 16:04 monocongo

I'll play around with the simple PDSI for now. When I've enough time I'll dig into the calibration method and share what ever problem I can find/fix.

lukruh avatar Apr 09 '20 15:04 lukruh

Hi, thanks for this great package! I was wondering if there is any update on this issue.

Also, could you provide an option to calculate scPDSI alone so that it allows the situation that "awc" is not provided? According to the definition here, scPDSI can be estimated based on temperature & precipitation without "awc", which is not always available (and scPDSI seems not very sensitive to awc?).

Thanks a lot!

fzhu2e avatar Apr 30 '20 00:04 fzhu2e

Sorry, @fzhu2e no progress yet on the slow performance issue with scPDSI. I'm not sure I'll ever have enough time to fix that.

If you can add support for a missing AWC for scPDSI then please submit a PR. I'm not aware of a way to calculate Palmers without AWC, but if there is a method then it would be good to have an implementation here.

monocongo avatar Apr 30 '20 02:04 monocongo

Hi, I had the very same problem and spent some time looking for the bug.

Commenting out the numba decorator for the "_z_sum" function in _palmer.py (line 1859) worked for me.

#@numba.jit def _z_sum(

However, I can not give an explanation.

I used the code snipped in #370 for testing.

@monocongo: Thanks for the package. This is exactly what I was looking for. Now I get a ZeroDivision Exception for my netcdf Data. Will see if I can find this bug too.

awellpott avatar Oct 14 '20 18:10 awellpott

Thanks for the update, @awellpott

If you want to post a link to your NetCDF file I can try to reproduce the division by zero error on my side. The last time I tried running Palmers on a large dataset I also hit a division by zero error. I never had time to find the problem, so this may be the same thing, and if so it'd be good to get to the bottom of it.

monocongo avatar Oct 20 '20 16:10 monocongo

Hello,

Thanks for the package. I have been using it with large gridded data very easily.

I'm not sure if you had time to look into the zero division / math domain error @monocongo

I'm not an expert in PDSI but I ended with similar issue as the one mentioned above. I had a look at the source code and I think the issue comes from pixels which never have precipitation. In this case, in function _climatic_characteristic (in palmer.py), I end up with P_bar[i] and L_bar[i] both equal to 0 when computing K_hat leading to the 0 division. T_hat[i] = (PET_bar[i] + R_bar[i] + RO_bar[i]) / (P_bar[i] + L_bar[i]) A simple tweak like this prevents the code from crashing though I'm not sure it behave as it should in this case.

       if P_bar[i] == 0 and  L_bar[i] == 0 : T_hat[i] = 0
       else : T_hat[i] = (PET_bar[i] + R_bar[i] + RO_bar[i]) / (P_bar[i] + L_bar[i])

Maybe another solution would be to replace the precipitation = 0 by nans and therefore the index at this pixel would be nan (like in desert)

    P_bar = np.nanmean(precip, axis=0)
    P_bar[ P_bar == 0 ] = np.nan

Hopefully this could help you.

Nbruneau9 avatar Nov 19 '20 21:11 Nbruneau9

Thanks, @Nbruneau9 this is very helpful!

I'm not sure at all of the ramifications of these proposed changes, and to be honest I only ever had a loose understanding of the implementation of the PDSI and scPDSI -- this code is a translation of the original Fortran used at NOAA in the 80s plus other updates gleaned from various other implementations. To be honest now that I've had peek under the covers (I worked on the PDSI for almost two years off and on) I'd not use Palmer indices in my research, as there are too many implementations and none of them match (this package is an attempt to remedy that and provide some standardization), and the algorithmm itself is so complicated that even the experts are confused by it (I worked with them for years, they just trust the Fortran code and it's a steaming pile of cruft with numerous disparate versions in use just at NOAA alone).

If we can somehow compare the results of running the code with these updates vs. other PDSI datasets then it may be a way to determine if this is reasonable or not. I'm happy to update the code with this fix if so. Or if you know of any other way to validate this implementation of the Palmer indices (the others are easy) then please advise. In any event I really appreciate your help, and I'm glad to know that this package is useful to you.

monocongo avatar Nov 21 '20 19:11 monocongo