transform between domain and codomain of corr_matrix not stable for init_to_uniform strategy
Please look at this forum topic for the context of this issue.
I think it is a bit tricky to improve precision using the current transforms:
import numpyro.distributions as dist
from jax import random
import jax.numpy as jnp
import numpyro

d = 10
x = random.uniform(random.PRNGKey(0), (d * (d + 1) // 2,), minval=-2., maxval=2.)
print("min eigen of corr_matrix",
      jnp.linalg.eig(dist.biject_to(dist.constraints.corr_matrix)(x))[0].real.min())
corr_tril = dist.util.signed_stick_breaking_tril(jnp.tanh(x))
print("min diagonal of corr_matrix_tril", jnp.diag(corr_tril).min())
print("min eigen of corr_matrix computed manually",
      jnp.linalg.eig(corr_tril @ corr_tril.T)[0].real.min())
shows that even for 10x10 matrices, the operation L @ L.T can give us a matrix that is not numerically positive definite, even though L is a valid lower-triangular Cholesky factor of a positive definite matrix. Until we can derive a better transform, I would recommend working with corr_cholesky instead.
In the model, you just need to add an additional step:
L = numpyro.sample("L", dist.LKJCholesky(dim))  # or dist.ImproperUniform(corr_cholesky, ...)
corr_matrix = L @ L.T
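For completeness, here is a minimal end-to-end sketch of that pattern; the likelihood, site names, and data below are illustrative, not part of the original report:
import jax.numpy as jnp
from jax import random
import numpyro
import numpyro.distributions as dist
from numpyro.infer import MCMC, NUTS

def model(y):
    dim = y.shape[-1]
    # sample the Cholesky factor of the correlation matrix directly
    L = numpyro.sample("L", dist.LKJCholesky(dim, concentration=1.0))
    # expose the full correlation matrix as a deterministic site if it is needed downstream
    numpyro.deterministic("corr_matrix", L @ L.T)
    # parameterize the likelihood with the factor itself (unit variances assumed here)
    numpyro.sample("y", dist.MultivariateNormal(jnp.zeros(dim), scale_tril=L), obs=y)

y = random.normal(random.PRNGKey(1), (100, 3))  # made-up data
mcmc = MCMC(NUTS(model), num_warmup=500, num_samples=500)
mcmc.run(random.PRNGKey(0), y)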
Hi @fehiepsi,
I found that the positive-definiteness problem occurs even when using the LKJCholesky distribution.
Code to reproduce:
import numpy as np
import jax.numpy as jnp
from jax import random
import numpyro
import numpyro.distributions as dist

# utility for PD check
def is_pos_def(x):
    return np.all(np.linalg.eigvals(np.array(x)) > 0)

# generate random key
rng_key = random.PRNGKey(0)

# sample for lower triangular matrix
dim_L_Sigma = 16
eta_L_Sigma = jnp.ones(1)
L_Sigma = numpyro.sample("L_Sigma",
                         dist.LKJCholesky(dim_L_Sigma, eta_L_Sigma),
                         rng_key=rng_key)

# sample for vector of variances
theta_L_Sigma = numpyro.sample("theta_L_Sigma",
                               dist.HalfCauchy(jnp.ones(dim_L_Sigma)),
                               rng_key=rng_key)
sigma_L_Sigma = jnp.sqrt(theta_L_Sigma)

# get covariance matrix
Sigma_ = jnp.outer(sigma_L_Sigma, sigma_L_Sigma) * (L_Sigma @ L_Sigma.T)

# check for PD condition
print('Is Sigma_ positive definite:', is_pos_def(Sigma_))
Is there any suggestion to solve it?
Many thanks!
Sorry for the noise...
Just checked the shape: a sample from dist.LKJCholesky here is a 3-dimensional array of shape (1, d, d), because the concentration eta_L_Sigma = jnp.ones(1) has batch shape (1,).
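A quick way to see where the extra dimension comes from (the batch shape of the concentration parameter propagates into the sample):
import jax.numpy as jnp
import numpyro.distributions as dist

d = dist.LKJCholesky(16, jnp.ones(1))
print(d.batch_shape, d.event_shape)  # (1,) (16, 16)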
If I add a reshape operation to squeeze it, the result is as expected -- a positive definite matrix:
# generate random key
rng_key = random.PRNGKey(0)

# sample for lower triangular matrix
dim_L_Sigma = 16
eta_L_Sigma = jnp.ones(1)
L_Sigma = numpyro.sample("L_Sigma",
                         dist.LKJCholesky(dim_L_Sigma, eta_L_Sigma),
                         rng_key=rng_key).reshape(dim_L_Sigma, dim_L_Sigma)

# sample for vector of variances
theta_L_Sigma = numpyro.sample("theta_L_Sigma",
                               dist.HalfCauchy(jnp.ones(dim_L_Sigma)),
                               rng_key=rng_key)
sigma_L_Sigma = jnp.sqrt(theta_L_Sigma)

# get covariance matrix
Sigma_ = jnp.outer(sigma_L_Sigma, sigma_L_Sigma) * (L_Sigma @ L_Sigma.T)

# check for PD condition
print('Is Sigma_ positive definite:', is_pos_def(Sigma_))
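An alternative to the reshape, assuming no batching over concentrations is intended, is to pass a scalar concentration so the sample has no batch dimension at all:
L_Sigma = numpyro.sample("L_Sigma",
                         dist.LKJCholesky(dim_L_Sigma, concentration=1.0),
                         rng_key=rng_key)
print(L_Sigma.shape)  # (16, 16) -- no leading batch dimension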
Sorry I'm here again...
I found that when I put it in a model, the PD condition breaks while the MCMC run is being prepared:
Updated PD check utility:
def is_pos_def(x):
    return jnp.linalg.eigh(x)[0].real.min() > 0
Code segment in the model:
dim_L_Sigma = dataset.num_unconstraints
eta_L_Sigma = jnp.ones(1)
L_Sigma = numpyro.sample("L_Sigma",
                         dist.LKJCholesky(dim_L_Sigma, eta_L_Sigma)).reshape(dim_L_Sigma, dim_L_Sigma)

# Vector of variances for each of the dim_L_Sigma variables
theta_L_Sigma = numpyro.sample("theta_L_Sigma",
                               dist.HalfCauchy(jnp.ones(dim_L_Sigma)))
sigma_L_Sigma = jnp.sqrt(theta_L_Sigma)

L_Sigma_ = jnp.matmul(jnp.diag(sigma_L_Sigma), L_Sigma)
Sigma_ = jnp.outer(sigma_L_Sigma, sigma_L_Sigma) * (L_Sigma @ L_Sigma.T)
print('Is Sigma_ positive definite:', is_pos_def(Sigma_))
Code for MCMC:
nuts_kernel = NUTS(model)
mcmc = MCMC(nuts_kernel, num_samples=2000, num_warmup=2000)
rng_key = random.PRNGKey(0)
mcmc.run(rng_key, data)
Output of an MCMC run:
Is Sigma_ positive definite: True
Is Sigma_ positive definite: True
Is Sigma_ positive definite: Traced<ShapedArray(bool[])>with<DynamicJaxprTrace(level=1/0)>
Is Sigma_ positive definite: False
And the MCMC run stopped with an error:
ValueError: MultivariateNormal distribution got invalid covariance_matrix parameter.
since I use Sigma_ as the covariance matrix of a MultivariateNormal distribution later.
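In case it helps, one way to sidestep the covariance_matrix validation entirely (a suggestion on top of the original thread; mu and data below are placeholders) is to pass the Cholesky factor via scale_tril, since L_Sigma_ above is already the Cholesky factor of Sigma_:
# Sigma_ = diag(sigma) @ (L @ L.T) @ diag(sigma) = L_Sigma_ @ L_Sigma_.T,
# so L_Sigma_ is the Cholesky factor of Sigma_ and the covariance matrix
# never has to be formed or checked for positive definiteness
numpyro.sample("obs", dist.MultivariateNormal(loc=mu, scale_tril=L_Sigma_), obs=data)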
--
What could be the underlying problem, or is there any suggestion for solving it?
Thanks for any feedback!
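One note on the Traced<ShapedArray(bool[])> line above: during an MCMC run the model is traced and jit-compiled by JAX, so a Python print inside the model sees abstract tracers rather than concrete values. A sketch for inspecting concrete values inside traced code, using jax.debug.print (available in recent JAX versions):
import jax

# inside the model, instead of a Python print of is_pos_def(Sigma_):
jax.debug.print("min eigenvalue of Sigma_: {}", jnp.linalg.eigh(Sigma_)[0].min())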
--
Additional information: my model has a specific prior structure, so I have to assign the initial parameters myself.
Update:
I have solved the problem by checking the computational trace within my model and fixing some numerical overflow issues (e.g., with jnp.exp() when deriving categorical logits).
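For others hitting the same overflow: numpyro's Categorical accepts unnormalized log probabilities directly, so the explicit jnp.exp() can often be dropped altogether (a sketch; raw_scores is a placeholder for whatever produces the logits):
# instead of probs=jnp.exp(raw_scores), which can overflow, pass logits directly
z = numpyro.sample("z", dist.Categorical(logits=raw_scores))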
Hope my experience can help others facing similar problems.