
GaussianProcessRegressionModel - None index_points

Open ducvinh-nguyen opened this issue 3 years ago • 0 comments

Hello,

The problem arises when setting up a GaussianProcessRegressionModel without specifying index_points. I provide them later, when predicting, via methods like model.mean, model.stddev, and model.sample, which feels more natural to me. This should not be a problem. However, while model.mean and model.stddev do their job correctly, model.sample gives the error below:

Traceback (most recent call last):
  File "example_issue_TFP_None.py", line 59, in <module>
    samples = model.sample(1, index_points=index_points).numpy()
  File "/home/vinh/miniconda3/envs/data/lib/python3.8/site-packages/tensorflow_probability/python/distributions/distribution.py", line 1234, in sample
    return self._call_sample_n(sample_shape, seed, **kwargs)
  File "/home/vinh/miniconda3/envs/data/lib/python3.8/site-packages/tensorflow_probability/python/distributions/distribution.py", line 1216, in _call_sample_n
    return self._set_sample_static_shape(samples, sample_shape)
  File "/home/vinh/miniconda3/envs/data/lib/python3.8/site-packages/tensorflow_probability/python/distributions/distribution.py", line 2029, in _set_sample_static_shape
    x, self.event_shape, batch_shape)
  File "/home/vinh/miniconda3/envs/data/lib/python3.8/site-packages/tensorflow_probability/python/distributions/distribution.py", line 1169, in event_shape
    self.dtype, tf.TensorShape, self._event_shape(), check_types=False)
  File "/home/vinh/miniconda3/envs/data/lib/python3.8/site-packages/tensorflow_probability/python/distributions/gaussian_process.py", line 652, in _event_shape
    if self._is_univariate_marginal(index_points):
  File "/home/vinh/miniconda3/envs/data/lib/python3.8/site-packages/tensorflow_probability/python/distributions/gaussian_process.py", line 418, in _is_univariate_marginal
    index_points.shape[-(self.kernel.feature_ndims + 1)]
AttributeError: 'NoneType' object has no attribute 'shape'

I monitored the variable index_points in gaussian_process.py and found that it becomes None only after around 2 evaluations. It seems like a bug somewhere. I hope someone can fix this.

The code below reproduces the error:

# This file reproduces the error in GaussianProcessRegressionModel in tfd.
# When index_points is only specified later, in the sample call, it becomes None.

import numpy as np
import matplotlib.pyplot as plt

import tensorflow as tf
import tensorflow_probability as tfp

tfd = tfp.distributions
tfk = tfp.math.psd_kernels

# Do not use GPU
tf.config.set_visible_devices([], "GPU")
tf.compat.v1.disable_eager_execution()  # adding this line makes Keras training much slower

# Generate data
def f(x):
    return (x + 1) * np.sin(5 * x)


x_train = np.arange(-1 + 0.05, 1, 0.2)
y_train = f(x_train)

index_points = np.arange(-1 + 0.1, 1, 0.2)

# Plot the problem
plt.figure()
plt.plot(x_train, y_train, "o", label="Training points")
plt.xlim(-1, 1)
plt.ylim(-2, 2)
plt.xlabel("x")
plt.ylabel("f")
plt.grid()
plt.legend()
plt.show(block=False)

# Reshape
X_train = x_train.reshape(x_train.shape[0], 1).astype(np.float32)
Y_train = y_train.reshape(x_train.shape[0], 1).astype(np.float32)

index_points = index_points.reshape(index_points.shape[0], 1).astype(np.float32)

# Simple model
kernel = tfk.ExponentiatedQuadratic(0.5, 0.1)

model = tfd.GaussianProcessRegressionModel(
    kernel=kernel,
    observation_index_points=X_train.astype(np.float32),
    observations=y_train.astype(np.float32),
    observation_noise_variance=0.0,
    predictive_noise_variance=None,
    mean_fn=None,
)

mean = model.mean(index_points=index_points)
std = model.stddev(index_points=index_points)

samples = model.sample(1, index_points=index_points).numpy()

Other question: in sklearn.gaussian_process.GaussianProcessRegressor, we fit the model on the training data first and predict later. I imagine this is faster and more efficient, since the "fit" performs the expensive matrix computations on the training set once and reuses the result at prediction time, with only some extra computation involving the prediction set. In GaussianProcessRegressionModel, the documentation examples seem to recommend providing the prediction set and the training set at the same time. My impression is that the model redoes all the computation from scratch each time we sample or compute the mean on index_points, so many operations on the training set are performed over and over. Is this true? If so, how can I reproduce sklearn's training/prediction separation? I guess this matters a lot when there is a lot of training data.

On the latter question, I tried pre-training the model with the model.get_marginal_distribution() method. But the resulting MultivariateNormalLinearOperator cannot compute the variance or stddev because the Cholesky decomposition fails. The reason is that the Cholesky is done without jitter, unlike in GaussianProcessRegressor. Jitter is needed since TensorFlow's default float precision is float32, whereas float64 would be required.

Thank you all for this awesome package!

ducvinh-nguyen avatar Sep 12 '22 08:09 ducvinh-nguyen