scikit-extremes

Sign Convention Used in this package

Open chiaral opened this issue 2 years ago • 4 comments

@jaymay2002 and I are using your package to fit some annual maxima. I sort of stumbled on the sign convention of this package, and I am no longer sure which sign convention it uses.

Little summary: the typical convention I am used to (which is not necessarily right or wrong) is the one also presented for reference on the Wiki page: positive shape parameters give the Fréchet/type II distribution (+/II), negative shape parameters give the Weibull/type III distribution (-/III).
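For reference, this is the GEV CDF as I read it in the WIKI convention, with location μ, scale σ and shape ξ (the symbols are my notation; it is the same expression used in the WIKI code further down):

```latex
F(x;\mu,\sigma,\xi) = \exp\left\{-\left[1 + \xi\,\frac{x-\mu}{\sigma}\right]^{-1/\xi}\right\},
\qquad 1 + \xi\,\frac{x-\mu}{\sigma} > 0,\quad \xi \neq 0
```

With ξ < 0 the support condition gives an upper bound at x = μ - σ/ξ (Weibull-like); with ξ > 0 it gives a lower bound at the same expression and a heavy upper tail (Fréchet-like).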

If I use scipy.stats.genextreme, I know that it uses the opposite sign convention; this is mentioned in its documentation:

import xarray as xr
import numpy as np
from scipy.stats import genextreme
import matplotlib.pyplot as plt
import scipy.stats as st

# generate some values mimicking 110 annual maxima (AM)
N = 110
px = np.arange(1, N+1)/(N+1)
negative_shape_st = genextreme.ppf(px, -0.25, loc=20, scale=10)
positive_shape_st = genextreme.ppf(px, 0.25, loc=20, scale=10)

When I plot them (I think I am doing it correctly here; I am copying the full code so my steps are clear):

plt.figure()
ax= plt.subplot(111)
plt.plot(negative_shape_st, -np.log(-np.log(px)), 'r.-', label= 'negative genextreme.shape')
plt.plot(positive_shape_st, -np.log(-np.log(px)), 'b.-', label= 'positive genextreme.shape')
RT= np.array([1.1,2,5,10,15,20,25,50, 100])
px_RT = 1-1/RT
ax.set_yticks(-np.log(-np.log(px_RT)))
ax.set_yticklabels(RT)
plt.legend()

I get this figure (please note that the probability/return period are on the y-axis): image

The Fréchet-like distribution (fat tail, in red) corresponds to a negative shape parameter, and vice versa for the Weibull-like distribution (bounded above, in blue). This confirms to me that scipy.stats.genextreme has the opposite behaviour to the typical convention; let's call the typical one the WIKI convention.
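As a numerical cross-check of the same point, here is a minimal sketch that compares the GEV CDF written in the WIKI convention, exp(-(1 + ξ(x-μ)/σ)^(-1/ξ)), against scipy.stats.genextreme evaluated with the shape sign flipped. `wiki_gev_cdf` is a hypothetical helper written only for this check, and the grid of x values is chosen to stay inside the support for both shapes.

```python
import numpy as np
from scipy.stats import genextreme


def wiki_gev_cdf(x, xi, loc, scale):
    """GEV CDF in the WIKI convention.

    Hypothetical helper; only valid where 1 + xi*(x - loc)/scale > 0 and xi != 0.
    """
    z = 1.0 + xi * (np.asarray(x, dtype=float) - loc) / scale
    return np.exp(-z ** (-1.0 / xi))


loc, scale = 20.0, 10.0
x = np.linspace(1.0, 59.0, 30)  # inside the support for xi = -0.25 and xi = +0.25

for xi in (-0.25, 0.25):
    # scipy's shape argument c is passed with the opposite sign of the WIKI xi
    same = np.allclose(
        wiki_gev_cdf(x, xi, loc, scale),
        genextreme.cdf(x, -xi, loc=loc, scale=scale),
    )
    print(f"WIKI xi = {xi:+.2f} vs scipy c = {-xi:+.2f}: match = {same}")
```

If both comparisons match, the two parameterisations differ only by the sign of the shape, i.e. scipy's c = -ξ.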

Now let's move to this package. In the documentation you state that:

  • 𝜉>0 corresponds to the Fréchet (type II) and

  • 𝜉<0 corresponds to the Weibull (type III)

In all my experience with L-moments, I have always used the +/II and -/III convention, so I was expecting the same from this package. However, when I pass these generated values to scikit-extremes I get:

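import skextremes

# fit the scipy-generated samples (negative_shape_st, positive_shape_st) using L-moments
param_neg = skextremes.models.classic.GEV(negative_shape_st, fit_method="lmoments", frec=1)
print(param_neg.params)

param_pos = skextremes.models.classic.GEV(positive_shape_st, fit_method="lmoments", frec=1)
print(param_pos.params)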

OrderedDict([('shape', -0.2060769098499644), ('location', 20.136579831214807), ('scale', 9.937463870947118)])
OrderedDict([('shape', 0.2485270265980506), ('location', 20.05459337496466), ('scale', 9.81974150422217)])

So scikit-extremes estimates a negative (positive) shape for the values generated with a negative (positive) shape parameter in scipy.stats.genextreme. In other words, scikit-extremes seems to use the same sign convention as scipy.stats.genextreme. (We checked all fitting methods, and this holds across methods.)
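One possible explanation, which I have not checked against the scikit-extremes source: Hosking's classic L-moment estimator for the GEV uses a shape k that is positive for the bounded-above case, i.e. the same sign as scipy's c. The sketch below is my own implementation of that textbook approximation (unbiased probability-weighted moments, then k ≈ 7.8590c + 2.9554c²); `gev_lmoments_hosking` is a hypothetical name and this is not the package's code.

```python
import numpy as np
from math import gamma, log
from scipy.stats import genextreme


def gev_lmoments_hosking(sample):
    """Return (k, loc, scale) from sample L-moments via Hosking's approximation.

    Hypothetical helper. k uses Hosking's sign (k > 0: bounded above),
    so k = -xi in the WIKI convention.
    """
    x = np.sort(np.asarray(sample, dtype=float))
    n = x.size
    i = np.arange(1, n + 1)
    # Unbiased probability-weighted moments b0, b1, b2
    b0 = x.mean()
    b1 = np.sum((i - 1) / (n - 1) * x) / n
    b2 = np.sum((i - 1) * (i - 2) / ((n - 1) * (n - 2)) * x) / n
    # First three L-moments and the L-skewness t3
    l1, l2, l3 = b0, 2 * b1 - b0, 6 * b2 - 6 * b1 + b0
    t3 = l3 / l2
    # Hosking's approximation for the shape, then scale and location
    c = 2.0 / (3.0 + t3) - log(2) / log(3)
    k = 7.8590 * c + 2.9554 * c ** 2
    scale = l2 * k / ((1 - 2.0 ** (-k)) * gamma(1 + k))
    loc = l1 - scale * (1 - gamma(1 + k)) / k
    return k, loc, scale


N = 110
px = np.arange(1, N + 1) / (N + 1)
for c_true in (-0.25, 0.25):
    sample = genextreme.ppf(px, c_true, loc=20, scale=10)
    k, loc, scale = gev_lmoments_hosking(sample)
    print(f"scipy c = {c_true:+.2f} -> k = {k:+.3f}, loc = {loc:.2f}, scale = {scale:.2f}")
```

If the estimated k has the same sign as the scipy shape used to generate the sample, that would be consistent with the package following Hosking's convention rather than the WIKI one.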

Instead if we use the WIKI equations:

x = np.arange(1, 120)  # generate values, e.g. precipitation amounts

# fix parameters
location = 20
scale= 10

# WEIBULL
neg_shape= -0.25
x_neg= x[x<location-scale/neg_shape] # the Weibull has an upper bound
t_x_neg = (1+neg_shape*((x_neg-location)/scale))**(-1/neg_shape)
CDF_neg = np.exp(-t_x_neg)

# FRECHET
pos_shape=  0.25
t_x_pos = (1+pos_shape*((x-location)/scale))**(-1/pos_shape)
CDF_pos = np.exp(-t_x_pos)

plt.figure()
ax= plt.subplot(111)
plt.plot(x_neg, -np.log(-np.log(CDF_neg)), 'r.-', label= 'negative WIKI shape')
plt.plot(x, -np.log(-np.log(CDF_pos)), 'b.-', label= 'positive WIKI shape')
RT= np.array([1.1,2,5,10,15,20,25,50, 100])
px_RT = 1-1/RT
ax.set_yticks(-np.log(-np.log(px_RT)))
ax.set_yticklabels(RT)

plt.plot(negative_shape_st, -np.log(-np.log(px)), 'ro', label= 'negative genextreme.shape')
plt.plot(positive_shape_st, -np.log(-np.log(px)), 'bo', label= 'positive genextreme.shape')
plt.legend()

image

The convention is the opposite: under the WIKI convention a positive shape corresponds to a fat tail, and a negative shape to a distribution that is bounded above.

Did I get this right? No sign convention is "right or wrong", but the documentation as it stands is misleading.

Thanks!

EDIT: I had tagged the wrong Jaymay, apologies.

chiaral, Apr 04 '22 15:04