
GeoStat-Framework integration: PyKrige v2

Open MuellerSeb opened this issue 5 years ago • 6 comments

Hurray! PyKrige is now part of the GeoStat-Framework

Now we have to think about how to integrate PyKrige smoothly into this framework and how to organize its coexistence with GSTools.

This GSTools PR, https://github.com/GeoStat-Framework/GSTools/pull/67, introduces a set of kriging routines:

  • simple kriging
  • ordinary kriging
  • universal kriging
  • external drift kriging
  • detrended kriging

All of these procedures work in 1D, 2D and 3D.

PyKrige could work as the extension for fancy kriging, like

  • moving-window
  • regression kriging
  • N-dimensional kriging (#138)
  • integration with scikit-learn (#143)

I think it would be nice to collect the things that should be provided by PyKrige and the things that could be outsourced to GSTools to reduce redundancy.

TODOs ATM

  • [x] N-dimensional kriging (to provide rotation, we could simply demand a rotation matrix [orthogonal matrix with det=1]) #133 #31
  • [ ] choosable distance metric #120
  • [ ] add all Variogram-models that are provided in GSTools
  • [ ] bring parametrization of variogram-models in line with GSTools #119
  • [ ] variogram-estimation with GSTools (working on automated estimation ATM) #130 #29 #57 #97
  • [ ] use import-export routines of GSTools for mesh-io #122
  • [x] building wheels with cibuildwheel
  • [x] dropping py2 support (https://python3statement.org/)
  • [x] updating DOC to be in line with the GeoStat style
  • [x] create a separate develop branch; master should hold latest release
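For the rotation idea in the first TODO item, a minimal sketch of what validating a user-supplied rotation matrix could look like (orthogonal, det = +1). The function names are hypothetical and not part of PyKrige or GSTools; the sketch uses plain Python lists so it runs without dependencies, though NumPy would be the natural choice in practice.

```python
import math


def gram_matrix(m):
    """Return m^T m for a square matrix given as a list of rows."""
    n = len(m)
    return [[sum(m[k][i] * m[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def det(m):
    """Determinant by Laplace expansion along the first row (fine for small N)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0.0
    for col in range(n):
        minor = [row[:col] + row[col + 1:] for row in m[1:]]
        total += (-1) ** col * m[0][col] * det(minor)
    return total


def is_rotation_matrix(m, tol=1e-9):
    """True if m is square, orthogonal (m^T m = I) and has det = +1."""
    n = len(m)
    if any(len(row) != n for row in m):
        return False
    gram = gram_matrix(m)
    orthogonal = all(
        abs(gram[i][j] - (1.0 if i == j else 0.0)) <= tol
        for i in range(n) for j in range(n)
    )
    return orthogonal and abs(det(m) - 1.0) <= tol


angle = math.radians(30.0)
rot_2d = [[math.cos(angle), -math.sin(angle)],
          [math.sin(angle), math.cos(angle)]]
reflection = [[1.0, 0.0], [0.0, -1.0]]  # orthogonal, but det = -1
```

The det = +1 condition is what excludes reflections, which are orthogonal but flip orientation.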

Project

https://github.com/GeoStat-Framework/PyKrige/projects/1

What do you think? @rth @bsmurphy @LSchueler

MuellerSeb avatar Jan 27 '20 09:01 MuellerSeb

Thanks for taking this on, @MuellerSeb! A few quick thoughts on this...

The tools for variogram estimation/modeling/etc in PyKrige are admittedly underdeveloped, so relying on your efforts in GSTools would be good I think. Probably won't be too hard to refactor the existing PyKrige code to use the GSTools variogram code.

I think refactoring to ND kriging would be very valuable, and actually shouldn't be too hard in the existing PyKrige framework. And the existing universal kriging drift terms could then be extended into N dimensions.

bsmurphy avatar Jan 28 '20 03:01 bsmurphy

Thanks for the summary @MuellerSeb ! The plan sounds good to me as well.

bring parametrization of variogram-models in line with GSTools #119

One constraint is backward compatibility: how do we make that transition without breaking existing users' code? Maybe by adding the GSTools variogram models and a deprecation warning for the current options that would change in the future.
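A rough sketch of how such a deprecation warning could be emitted with the standard `warnings` module. The function name and the set of affected model names are made up for illustration; which options would actually be flagged is still open.

```python
import warnings

# Hypothetical set of options whose behaviour would change; illustration only.
MODELS_TO_CHANGE = {"linear", "power"}


def select_variogram_model(name):
    """Return the model name, warning if its parametrization may change."""
    if name in MODELS_TO_CHANGE:
        warnings.warn(
            f"The parametrization of the '{name}' variogram model may change "
            "in a future major release.",
            DeprecationWarning,
            stacklevel=2,  # point the warning at the caller's line
        )
    return name
```

Existing code keeps working unchanged; users only see the warning, which gives them a release cycle to adapt.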

dropping py2 support (https://python3statement.org/)

+1

building wheels with cibuildwheel

That would be ideal indeed, though it would likely require some work.

rth avatar Jan 28 '20 12:01 rth

@rth : I would create a 1.5 version incorporating GSTools as proposed in #125 . We could add deprecation warnings there.

Everything else should be done within a 2.0 release, where we can break backward compatibility since the major version number changes. I would keep all the variogram models and just rescale them, so that in cases where the variogram was estimated, we should get the same results.
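To illustrate what "just rescale them" could mean, here is a sketch for the Gaussian model. It assumes the PyKrige form 1 - exp(-d²/(range·4/7)²) and the GSTools form 1 - exp(-(s·d/len_scale)²) with s = √π/2; both formulas are written from memory and should be checked against the actual code before relying on them. The function names are hypothetical.

```python
import math

GSTOOLS_GAUSS_FACTOR = math.sqrt(math.pi) / 2.0  # assumed rescale factor s
PYKRIGE_GAUSS_FACTOR = 4.0 / 7.0                 # assumed range factor


def gaussian_pykrige_style(d, psill, range_, nugget):
    """Gaussian variogram in the (assumed) PyKrige parametrization."""
    return psill * (1.0 - math.exp(-d**2 / (range_ * PYKRIGE_GAUSS_FACTOR) ** 2)) + nugget


def gaussian_gstools_style(d, sill_part, len_scale, nugget):
    """Gaussian variogram in the (assumed) GSTools parametrization."""
    return sill_part * (1.0 - math.exp(-(GSTOOLS_GAUSS_FACTOR * d / len_scale) ** 2)) + nugget


def range_to_len_scale(range_):
    """Length scale that reproduces the same curve as a given range."""
    return GSTOOLS_GAUSS_FACTOR * PYKRIGE_GAUSS_FACTOR * range_
```

With the converted length scale, both functions describe the same curve, so an estimated variogram would give the same kriging results under either parametrization.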

We are quite experienced with building wheels that include Cython code. @LSchueler will have a look at it when he has some spare time.

MuellerSeb avatar Jan 28 '20 12:01 MuellerSeb

You are right, making these changes in 2.0 would probably be best.

rth avatar Jan 28 '20 16:01 rth

One problem that comes up when bringing the variogram models in line is that all models in GSTools are stationary and assume a finite sill of the variogram. So these are incompatible ATM:

  • Linear Model
  • Power Model

In GSTools, the linear model is a truncated one, to provide a finite length scale. Power models are also provided in a truncated way (finite superposition of models on different scales).

We could circumvent this by setting the length scale larger than the field size.
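A small sketch of that workaround for the linear model, assuming the truncated form is sill · min(d/len_scale, 1) plus nugget (my reading of the truncated linear model, not copied from GSTools). The function names and numbers are illustrative only.

```python
def linear_unbounded(d, slope, nugget):
    """Non-stationary linear variogram: grows without bound."""
    return slope * d + nugget


def linear_truncated(d, sill_part, len_scale, nugget):
    """Truncated linear variogram: linear up to len_scale, then flat."""
    return sill_part * min(d / len_scale, 1.0) + nugget


field_size = 100.0            # largest distance that occurs in the data
slope, nugget = 0.05, 0.2
len_scale = 2.0 * field_size  # chosen larger than the field size
sill_part = slope * len_scale  # keeps the same slope below len_scale

distances = [0.0, 10.0, 50.0, field_size]
max_diff = max(
    abs(linear_unbounded(d, slope, nugget)
        - linear_truncated(d, sill_part, len_scale, nugget))
    for d in distances
)
```

Within the field, the truncation is never reached, so both models agree; they only differ at distances beyond `len_scale`, which do not occur in the data.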

MuellerSeb avatar Mar 29 '20 10:03 MuellerSeb

Sorry for not commenting on this sooner. The linear and power variogram models in PyKrige aren't stationary as defined, and so I suppose then they're not true covariance functions (if I'm remembering the underlying mathematical formalism correctly). I originally included them in the code for completeness following the Kitanidis geostats text. But it makes sense to have all variogram models be true covariance functions, so I think your idea @MuellerSeb sounds good.

bsmurphy avatar Apr 05 '20 04:04 bsmurphy