diptest
Dictionary with full statistics
I'm not sure what the dictionary returned with full_output=True contains. Also, for two extremely well-separated Gaussians, shouldn't the dip statistic be around 0.5 instead of 0.2?
I've made some plots to make this easier to understand, and the lcm (least concave majorant) and gcm (greatest convex minorant) appear to cover only one of the Gaussians. I expected both to run from left to right across the whole CDF, one tangent to it from below and the other from above, with a large gap in the middle where they are separated by a distance of about 0.5.
import numpy as np
import diptest
import matplotlib.pyplot as plt
# generate some bimodal random draws
N = 1000
hN = N // 2
x = np.empty(N, dtype=np.float64)
x[:hN] = np.random.normal(-1, 0.2, hN)
x[hN:] = np.random.normal(1, 0.2, hN)
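x = np.sort(x)  # sorted so that (x, y) traces the empirical CDF
y = np.linspace(0, 1, N)  # empirical CDF heights for the sorted sample
# full_output=True additionally returns a dictionary of intermediate results
dip, pval, res = diptest.diptest(x, full_output=True)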
fig, ax = plt.subplots(2,1,sharex=True)
# PDF axis
ax[0].hist(x,bins=50)
# CDF axis
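ax[1].plot(x, y)
# res['gcm'] and res['lcm'] are used here as index arrays into the sorted sample
ax[1].plot(x[res['gcm']], y[res['gcm']], 'x-')
ax[1].plot(x[res['lcm']], y[res['lcm']], 'x-')
ax[1].plot([x[res['dipidx']]], [y[res['dipidx']]], 'x-')
# xl and xu: lower and upper end of the modal interval
ax[1].axvline(res['xl'])
ax[1].axvline(res['xu'])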
plt.show()
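
To make my expectation concrete, here is a minimal sketch of the curves I had in mind: the greatest convex minorant and least concave majorant of the empirical CDF taken over the whole sample, built with a plain convex-hull scan. This is not how diptest computes its gcm and lcm, and I am not claiming the two should match. The helper names (lower_hull_indices, upper_hull_indices, hull_values) are my own, and the sketch assumes x is sorted with no repeated values. It uses only the standard library so it runs without diptest installed.

```python
import random


def lower_hull_indices(x, y):
    """Indices of the vertices of the greatest convex minorant of the
    points (x[i], y[i]); x must be sorted in ascending order."""
    hull = []
    for i in range(len(x)):
        # Drop the last vertex while it lies on or above the chord
        # from the vertex before it to the new point i.
        while len(hull) >= 2:
            a, b = hull[-2], hull[-1]
            cross = (x[b] - x[a]) * (y[i] - y[a]) - (y[b] - y[a]) * (x[i] - x[a])
            if cross > 0:
                break
            hull.pop()
        hull.append(i)
    return hull


def upper_hull_indices(x, y):
    """Vertices of the least concave majorant: the lower hull of (x, -y)."""
    return lower_hull_indices(x, [-v for v in y])


def hull_values(x, y, hull):
    """Piecewise-linear interpolation of the hull at every x[i]."""
    values = []
    seg = 0
    for i in range(len(x)):
        # Advance to the hull segment that contains index i.
        while seg < len(hull) - 2 and i > hull[seg + 1]:
            seg += 1
        a, b = hull[seg], hull[seg + 1]
        t = (x[i] - x[a]) / (x[b] - x[a])
        values.append(y[a] + t * (y[b] - y[a]))
    return values


# Same kind of bimodal sample as in my script, but seeded and NumPy-free.
random.seed(0)
N = 1000
hN = N // 2
x = sorted([random.gauss(-1, 0.2) for _ in range(hN)]
           + [random.gauss(1, 0.2) for _ in range(hN)])
y = [i / (N - 1) for i in range(N)]  # same values as np.linspace(0, 1, N)

gcm = lower_hull_indices(x, y)
lcm = upper_hull_indices(x, y)
gcm_vals = hull_values(x, y, gcm)
lcm_vals = hull_values(x, y, lcm)

# Largest vertical distance between the two global envelopes.
gap = max(u - l for u, l in zip(lcm_vals, gcm_vals))
print(f"largest vertical gap between global LCM and GCM: {gap:.3f}")
```

The helpers also accept the NumPy arrays x and y from my script, and the returned index lists can be used directly as x[gcm], y[gcm] to overlay these envelopes on the CDF axis next to the ones taken from res.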