
Problem when Sobol falls back to HitAndRunPolytopeSampler

parrangoiz opened this issue · 5 comments

I've been experimenting to understand the effect of linear inequality constraints on quasi-random sampling such as Sobol, and I'm seeing cases where the samples don't uniformly fill the polytope defined by the linear constraints (in addition to the usual box constraints) when Sobol is allowed to fall back to HitAndRunPolytopeSampler. I'm wondering whether this is related to https://github.com/pytorch/botorch/issues/1225, which was fixed a while back; perhaps the code path Ax follows to call this sampler doesn't go through the normalization steps that were implemented there?

Here's what I mean. First generate a 100-sample Sobol sequence without falling back to the polytope sampler; to make this possible I'll crank up the max number of rejection sampling tries to 1e5:

import matplotlib.pyplot as plt
import numpy as np

from ax.service.ax_client import AxClient
from ax.service.utils.instantiation import ObjectiveProperties
from ax.modelbridge.generation_strategy import GenerationStep, GenerationStrategy
from ax.modelbridge.registry import Models
from ax.modelbridge.modelbridge_utils import get_pending_observation_features

num_trials = 100
gs = GenerationStrategy(
    steps=[
        GenerationStep(
            model=Models.SOBOL,
            num_trials=num_trials,
            min_trials_observed=num_trials,
            model_kwargs={
                "seed": 0, 
                "deduplicate": True, 
                "fallback_to_sample_polytope": False
            },
            model_gen_kwargs={}
        )
    ]
)

ax_client = AxClient(generation_strategy=gs)

ax_client.create_experiment(
    name="hartmann_test_experiment",
    parameters=[
        {
            "name": "x1",
            "type": "range",
            "bounds": [0.0, 1.0]
        },
        {
            "name": "x2",
            "type": "range",
            "bounds": [0.0, 1.0],
        },
        {
            "name": "x3",
            "type": "range",
            "bounds": [0.0, 1.0],
        },
        {
            "name": "x4",
            "type": "range",
            "bounds": [0.0, 1.0],
        },
        {
            "name": "x5",
            "type": "range",
            "bounds": [0.0, 1.0],
        },
        {
            "name": "x6",
            "type": "range",
            "bounds": [0.0, 1.0],
        },
    ],
    objectives={"hartmann6": ObjectiveProperties(minimize=True)}, 
    parameter_constraints=[
        "x1 + x2 <= 0.1", 
        "x4 - x3 >= 0.5",
        "x5 - x6 >= 0.0"
        ]
)

for i in range(num_trials):
    generator_run = gs.gen(
        experiment=ax_client.experiment,
        data=None,
        n=1,
        pending_observations=get_pending_observation_features(ax_client.experiment),
        # allow up to 1e5 rejection-sampling draws so the fallback is never needed
        model_gen_options={"max_rs_draws": 1e5}
    )
    trial = ax_client.experiment.new_trial(generator_run)

parameters = [t.arm.parameters for t in ax_client.experiment.trials.values()]

fig, ax = plt.subplots(1, 3, figsize=(8, 3))
xs = np.array([[p[f"x{i+1}"] for p in parameters] for i in range(6)])
ax[0].scatter(xs[0, :], xs[1, :])
ax[1].scatter(xs[2, :], xs[3, :])
ax[2].scatter(xs[4, :], xs[5, :])

for i in range(3):
    ax[i].set_xlim(0, 1)
    ax[i].set_ylim(0, 1)

Producing all 100 generator runs is slow (about 15 seconds on my machine), probably due to a tiny acceptance fraction during rejection sampling, but the result looks exactly as we'd expect:

[image: pairwise scatter plots of the rejection-sampled Sobol points]
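For a sense of just how small that acceptance fraction is, here's a quick Monte Carlo check of the feasible volume (a minimal numpy sketch, not part of the experiment above):

import numpy as np

# Estimate the fraction of the unit cube satisfying the three constraints.
rng = np.random.default_rng(0)
x = rng.random((1_000_000, 6))
feasible = (
    (x[:, 0] + x[:, 1] <= 0.1)
    & (x[:, 3] - x[:, 2] >= 0.5)
    & (x[:, 4] - x[:, 5] >= 0.0)
)
print(feasible.mean())  # roughly 3e-4, i.e. thousands of draws per accepted point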

If I repeat the experiment above but set "fallback_to_sample_polytope": True and "max_rs_draws": 0 to force the fallback to polytope sampling, the code runs noticeably faster (about 4 seconds), but the results look quite biased towards the corners:

[image: pairwise scatter plots of the fallback polytope-sampler points, clustered near the corners of the feasible regions]
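For reference, the only changes relative to the first script are the fallback flag and the per-call generation options (a minimal sketch; everything else stays the same):

# Same GenerationStrategy as above, but with the fallback enabled and
# rejection sampling effectively disabled so the fallback is always taken:
model_kwargs = {
    "seed": 0,
    "deduplicate": True,
    "fallback_to_sample_polytope": True,  # changed from False
}
model_gen_options = {"max_rs_draws": 0}  # changed from 1e5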

As a sanity check, I called this sampler directly and got much more reasonable results, although still not as clean-looking as what you get (very slowly) through rejection sampling:

import torch
from torch import tensor
from botorch.utils.sampling import get_polytope_samples

# Each constraint is an (indices, coefficients, rhs) tuple encoding
# sum_i coefficients[i] * x[indices[i]] >= rhs:
#   x1 + x2 <= 0.1  ->  -x1 - x2 >= -0.1
#   x4 - x3 >= 0.5
#   x5 - x6 >= 0.0
samples = get_polytope_samples(
    n=100,
    bounds=torch.stack((torch.zeros(6), torch.ones(6))),
    inequality_constraints=[
        (torch.arange(6), tensor([-1.0, -1.0, 0.0, 0.0, 0.0, 0.0]), -0.1),
        (torch.arange(6), tensor([0.0, 0.0, -1.0, 1.0, 0.0, 0.0]), 0.5),
        (torch.arange(6), tensor([0.0, 0.0, 0.0, 0.0, 1.0, -1.0]), 0.0),
    ],
    seed=0,
)

[image: pairwise scatter plots of the get_polytope_samples draws]
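The plot above uses the same pairwise projections as before; roughly (a small matplotlib sketch, assuming samples is the 100 x 6 tensor returned above):

import matplotlib.pyplot as plt

# Plot the same three coordinate pairs as in the Ax experiments.
xs = samples.numpy().T  # shape (6, 100)
fig, ax = plt.subplots(1, 3, figsize=(8, 3))
ax[0].scatter(xs[0], xs[1])
ax[1].scatter(xs[2], xs[3])
ax[2].scatter(xs[4], xs[5])
for a in ax:
    a.set_xlim(0, 1)
    a.set_ylim(0, 1)
plt.show()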

I'm still pretty new to this, so I could easily be doing something wrong; feedback would be welcome!

parrangoiz · Apr 17, 2024

Thanks for flagging this; it does seem like a bug. I'll take a closer look.

Balandat · Apr 27, 2024

OK, so the issue is that the hit-and-run sampling fallback in Ax (see here) currently neither applies the normalization that you mentioned nor, more importantly, applies thinning to the samples, which results in much more correlated samples. Basically, we need to update the Ax logic to call get_polytope_samples instead. This should be straightforward; the only thing to be careful about is how to handle the seed between draws so the results remain deterministic. I can put up a PR for this.
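For concreteness, one way to keep repeated fallback draws deterministic while still producing fresh points on each gen() call would be to derive a per-call seed from the base seed (a hypothetical sketch of the idea, not the actual Ax change):

from botorch.utils.sampling import get_polytope_samples

def sample_polytope_fallback(n, bounds, inequality_constraints, base_seed, call_index):
    # Hypothetical helper: use a distinct but reproducible seed for each call
    # so consecutive fallback draws are deterministic without being identical.
    return get_polytope_samples(
        n=n,
        bounds=bounds,
        inequality_constraints=inequality_constraints,
        seed=base_seed + call_index,
    )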

Balandat · Apr 28, 2024

Thanks for looking into this. Sounds right; may as well call it directly since it seems to be working fine. When you say thinning, what exactly does that mean? Sorry, I'm not 100% familiar with all the jargon yet.

parrangoiz · Apr 28, 2024

Thinning (keeping only every k-th sample) is a process to reduce autocorrelation of samples drawn from an MCMC chain. This happens here: https://github.com/pytorch/botorch/blob/main/botorch/utils/sampling.py#L828 (this is a poor man's implementation; you could avoid storing the discarded samples as you go to save memory, but that's typically not a concern for the sample sizes we use for Bayesian optimization).
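In code, the idea is roughly (a toy sketch, not the BoTorch implementation):

def thin(chain_samples, thinning=32):
    # Keep only every `thinning`-th draw from the MCMC chain to reduce
    # autocorrelation between the retained samples.
    return chain_samples[::thinning]

# e.g. to end up with 100 nearly independent samples, draw 100 * 32 points
# from the chain and keep every 32nd one.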

Balandat · Apr 28, 2024

Got it, thanks.

parrangoiz · Apr 29, 2024

This has been fixed in #2492

Balandat · Jun 5, 2024

Closing, since the newest Ax release landing this week includes the fix :)

mgarrard · Jul 24, 2024