smt GPX doesn't support vector outputs

The GPX surrogate doesn't seem to support outputs with ny > 1, whereas the other surrogates do. It runs, but seems to return the response for the first element only. I guess it could be fixed here in SMT, but possibly also in the upstream egobox tool. Here's a small test that shows the difference between KRG and GPX:

import unittest
import numpy as np

from smt.surrogate_models import GPX, KRG

from smt.examples.rans_crm_wing.rans_crm_wing import get_rans_crm_wing


class TestNY(unittest.TestCase):
    def test_ny_gpx(self):
        xt, yt, _ = get_rans_crm_wing()

        interp = GPX()
        interp.set_training_values(xt, yt)
        interp.train()
        v0 = np.zeros((4, 2))
        for ix, i in enumerate([10, 11, 12, 13]):
            v0[ix, :] = interp.predict_values(np.atleast_2d(xt[i, :]))
        v1 = interp.predict_values(np.atleast_2d(xt[10:14, :]))

        self.assertEqual(v1.shape[1], yt.shape[1])

    def test_ny_krg(self):
        xt, yt, _ = get_rans_crm_wing()

        interp = KRG()
        interp.set_training_values(xt, yt)
        interp.train()
        v0 = np.zeros((4, 2))
        for ix, i in enumerate([10, 11, 12, 13]):
            v0[ix, :] = interp.predict_values(np.atleast_2d(xt[i, :]))
        v1 = interp.predict_values(np.atleast_2d(xt[10:14, :]))
        self.assertEqual(v1.shape[1], yt.shape[1])

Nov 20 '24 13:11 fzahle

Actually, KRG is not meant to handle multiple outputs. This was recently mentioned (too discreetly!) at the end of the Kriging section, but we should definitely raise an error if yt is not one-dimensional. When you use several training outputs, you do not get the right predictions, i.e. the ones you get if you keep only one training output. For GPX (or should I say egobox::Gpx), it so happens that with the default settings only the first predicted output is kept, but even so, if you pass several training outputs, the prediction you get is not the right one.

I'll keep this issue open as a reminder until we fix KRG (and the kriging-based surrogates) and GPX, i.e. by raising a ValueError("Multiple outputs not handled") when yt is not one-dimensional.

Nov 20 '24 18:11 relf

After discussion, we are starting with a warning, rather than a hard error, when several training outputs are used (PR #686). Though it is not the best one, a surrogate trained with multiple outputs may work for your use case, depending on your data. Out of curiosity, how does it work for you with KRG and multiple outputs? How many outputs do you have? Would it be feasible for you to train them separately? If there is a use case, I can "fix" egobox::Gpx to reproduce the same behaviour as KRG.
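For illustration, a warn-first check of this kind could look roughly like the sketch below. It is not the code from PR #686: the helper name `check_training_outputs` and the `strict` flag are hypothetical, and only NumPy and the standard `warnings` module are used.

```python
import warnings

import numpy as np


def check_training_outputs(yt, strict=False):
    """Hypothetical helper (not SMT API): flag training data with ny > 1.

    Returns the number of output columns. Warns by default, raises a
    ValueError when strict=True.
    """
    yt = np.asarray(yt)
    # A 1-D array is treated as a single output column.
    ny = 1 if yt.ndim == 1 else yt.shape[1]
    if ny > 1:
        msg = f"Multiple outputs not handled: got ny={ny}, expected 1"
        if strict:
            raise ValueError(msg)
        warnings.warn(msg, UserWarning, stacklevel=2)
    return ny
```

Calling `check_training_outputs(yt)` before training keeps existing multi-output scripts running while making the problem visible; switching to `strict=True` later turns the same check into the hard error.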

Dec 04 '24 15:12 relf

Below is an example of how multi-output kriging can go wrong, using just the sphere and cos(10*x) functions. If we train against both at once with KRG(), we get the following plot:

[plot: sphere_cos10x]

(Actually, the hyperparameter optimization just fails and ends up with the default value theta = 0.01.) If we train against the functions separately, we get:

[plot: sphere] (optimized theta = 0.0036)
[plot: cos10x] (optimized theta = 2.2243)

Dec 13 '24 08:12 relf

@relf thanks for going thoroughly into this. I'll change our OpenMDAO wrapper to stamp out a surrogate for each dimension of yt.
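A minimal sketch of that per-output approach, assuming only that each surrogate exposes set_training_values / train / predict_values as the SMT models in the test above do. The `PerOutputSurrogate` wrapper and the `NearestNeighbour` stand-in are hypothetical names made up for this example; the stand-in exists only so the sketch runs without SMT installed.

```python
import numpy as np


class PerOutputSurrogate:
    """Hypothetical wrapper: one independent surrogate per column of yt."""

    def __init__(self, factory):
        # factory: zero-argument callable returning a fresh single-output
        # surrogate, e.g. `lambda: KRG()` when SMT is available.
        self.factory = factory
        self.models = []

    def set_training_values(self, xt, yt):
        yt = np.asarray(yt)
        if yt.ndim == 1:
            yt = yt.reshape(-1, 1)
        self.models = []
        for j in range(yt.shape[1]):
            model = self.factory()
            # Keep the column 2-D, shape (nt, 1), for each sub-model.
            model.set_training_values(xt, yt[:, [j]])
            self.models.append(model)

    def train(self):
        for model in self.models:
            model.train()

    def predict_values(self, x):
        # Stack the single-output predictions back into shape (n, ny).
        cols = [np.asarray(m.predict_values(x)).reshape(-1, 1) for m in self.models]
        return np.hstack(cols)


class NearestNeighbour:
    """Stand-in single-output model, used only to make the sketch runnable."""

    def set_training_values(self, xt, yt):
        self.xt = np.asarray(xt, dtype=float)
        self.yt = np.asarray(yt, dtype=float)

    def train(self):
        pass  # nothing to fit

    def predict_values(self, x):
        x = np.asarray(x, dtype=float)
        dist = np.linalg.norm(x[:, None, :] - self.xt[None, :, :], axis=2)
        return self.yt[dist.argmin(axis=1)]


# Two outputs with very different length scales, as in the example above.
xt = np.linspace(0.0, 1.0, 11).reshape(-1, 1)
yt = np.hstack([xt**2, np.cos(10 * xt)])

model = PerOutputSurrogate(NearestNeighbour)
model.set_training_values(xt, yt)
model.train()
pred = model.predict_values(xt[3:7])
print(pred.shape)
```

With SMT installed, passing something like `lambda: KRG()` as the factory should give each output its own hyperparameter optimization, at the cost of training ny models instead of one.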

As to how it works with multiple outputs: I'm getting mixed results, which is why I started looking into it more closely. We train on 2D RANS CFD airfoil flows, where yt spans multiple angles of attack (AOAs).

Dec 13 '24 09:12 fzahle

@fzahle, thanks. At the moment, kriging-based surrogate models emit a warning when several training outputs are used, but this will have to become a hard error in the future.

Dec 24 '24 10:12 relf
