[Feature] Allow perturbing a configuration
space = ConfigurationSpace({...})
optimum = space.sample_configuration() # Or more likely evaluation
close_neighbor = optimum.perturb(std=0.05)  # std on a scale of 0-1
std doesn't really make sense as a name, since it's more like the radius of a sphere around a config, given as a fraction of each hyperparameter's range.
One thing that is unclear is the behaviour for categoricals.
Here's some reference work for mfpbench::Config::perturb():
def perturb(
    value: ValueT,
    hp: (
        Constant
        | UniformIntegerHyperparameter
        | UniformFloatHyperparameter
        | NormalIntegerHyperparameter
        | NormalFloatHyperparameter
        | CategoricalHyperparameter
        | OrdinalHyperparameter
    ),
    std: float,
    seed: int | np.random.RandomState | None = None,
) -> ValueT:
    # TODO:
    # * https://github.com/automl/ConfigSpace/issues/289
    assert 0 <= std <= 1, "Noise must be between 0 and 1"
    rng: np.random.RandomState
    if seed is None:
        rng = np.random.RandomState()
    elif isinstance(seed, int):
        rng = np.random.RandomState(seed)
    else:
        rng = seed

    if isinstance(hp, Constant):
        return value

    if isinstance(
        hp,
        (
            NormalIntegerHyperparameter,
            NormalFloatHyperparameter,
            UniformFloatHyperparameter,
            UniformIntegerHyperparameter,
        ),
    ):
        # TODO:
        # * https://github.com/automl/ConfigSpace/issues/287
        # * https://github.com/automl/ConfigSpace/issues/290
        # * https://github.com/automl/ConfigSpace/issues/291
        assert hp.upper is not None and hp.lower is not None
        assert hp.q is None
        assert isinstance(value, (int, float))

        if isinstance(hp, UniformIntegerHyperparameter):
            if hp.log:
                _lower = np.log(hp.lower)
                _upper = np.log(hp.upper)
            else:
                _lower = hp.lower
                _upper = hp.upper
        elif isinstance(hp, NormalIntegerHyperparameter):
            _lower = hp.nfhp._lower
            _upper = hp.nfhp._upper
        elif isinstance(hp, (UniformFloatHyperparameter, NormalFloatHyperparameter)):
            _lower = hp._lower
            _upper = hp._upper
        else:
            raise RuntimeError("Wut")

        # Scale std by the length of the (possibly log-transformed) range, once
        space_length = _upper - _lower
        rescaled_std = std * space_length

        if not hp.log:
            sample = np.clip(rng.normal(value, rescaled_std), _lower, _upper)
        else:
            logged_value = np.log(value)
            sample = rng.normal(logged_value, rescaled_std)
            sample = np.clip(np.exp(sample), hp.lower, hp.upper)

        if isinstance(hp, (UniformIntegerHyperparameter, NormalIntegerHyperparameter)):
            return int(np.rint(sample))
        elif isinstance(hp, (UniformFloatHyperparameter, NormalFloatHyperparameter)):
            return float(sample)  # type: ignore
        else:
            raise RuntimeError("Please report to github, shouldn't get here")

    # if isinstance(hp, (BetaIntegerHyperparameter, BetaFloatHyperparameter)):
    #     # TODO
    #     raise NotImplementedError(
    #         "BetaIntegerHyperparameter, BetaFloatHyperparameter not implemented"
    #     )

    if isinstance(hp, CategoricalHyperparameter):
        # With probability (1 - std) we keep the same value, otherwise we select
        # uniformly at random among the other choices
        if rng.uniform() < 1 - std:
            return value
        choices = set(hp.choices) - {value}
        return rng.choice(list(choices))

    if isinstance(hp, OrdinalHyperparameter):
        # TODO:
        # * https://github.com/automl/ConfigSpace/issues/288
        # We build a normal centered at the index of value
        # which acts on index spacings
        index_value = hp.sequence.index(value)
        index_std = std * len(hp.sequence)
        normal_value = rng.normal(index_value, index_std)
        # Clip to the last valid index, len(hp.sequence) - 1
        index = int(np.rint(np.clip(normal_value, 0, len(hp.sequence) - 1)))
        return hp.sequence[index]

    raise ValueError(f"Can't perturb {hp}")
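To make the numeric branch easier to reason about without ConfigSpace installed, here is a minimal standalone sketch of the same idea: scale std by the range length, sample from a normal centred on the current value (in log space if requested), then clip back into bounds. The function name perturb_numeric and all of its parameters are made up for illustration; this is not the mfpbench or ConfigSpace API, and it uses the standard library's random module instead of np.random.RandomState.

```python
import math
import random


def perturb_numeric(value, lower, upper, std, rng, log=False, integer=False):
    """Perturb a bounded numeric value (hypothetical helper, not a ConfigSpace API)."""
    if not 0 <= std <= 1:
        raise ValueError("std must be between 0 and 1")
    if log:
        # Work in log space so the noise is relative rather than absolute.
        lo, hi, centre = math.log(lower), math.log(upper), math.log(value)
    else:
        lo, hi, centre = lower, upper, value
    # std is a fraction of the range, so it is scaled by the range length once.
    scaled_std = std * (hi - lo)
    sample = rng.gauss(centre, scaled_std)
    if log:
        sample = math.exp(sample)
    # Clip in the original space so the result is always a legal value.
    sample = min(max(sample, lower), upper)
    return int(round(sample)) if integer else float(sample)


rng = random.Random(0)
lr = perturb_numeric(1e-3, 1e-5, 1e-1, std=0.05, rng=rng, log=True)
layers = perturb_numeric(4, 1, 8, std=0.05, rng=rng, integer=True)
```

With std=0 the value comes back unchanged, and with std=1 the normal is as wide as the whole range, which matches the "scale of 0-1" reading of the parameter.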
Hey, I think this is pretty close to the neighborhood retrieval currently implemented. What would be the exact difference?
As far as I'm aware, get_one_exchange_neighborhood acts slightly differently (at least the name suggests it would). The get_neighbors functions are not useful here, as they act on the values stored in the np.ndarray of the configuration, i.e. they're very much private functions.
If get_one_exchange_neighborhood could be used for this exact same effect, then we should attach it as a method to a configuration; get_one_exchange_neighborhood in ConfigSpace.util is not somewhere I would look.
Edit: I looked at one exchange. It only acts on one HP at a time by the looks of it, and we also needed to treat categoricals with some sort of "strength" for sticking to the current categorical, as captured here:
if isinstance(hp, CategoricalHyperparameter):
    # With probability (1 - std) we keep the same value, otherwise we select
    # uniformly at random among the other choices
    if rng.uniform() < 1 - std:
        return value
    choices = set(hp.choices) - {value}
    return rng.choice(list(choices))
This is because with a low std.dev like 0.1, we would like it to just stick to the same categorical 90% of the time.
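A quick way to sanity-check the "90% of the time" claim is to simulate it. The sketch below is a standalone re-implementation of the categorical rule using only the standard library; perturb_categorical and the optimizer names are hypothetical. It also differs from the snippet above in two small ways that are my own assumptions: it keeps the other choices in a list rather than a set, so results are reproducible for a given seed, and it returns the value unchanged when there is no other choice to switch to.

```python
import random


def perturb_categorical(value, choices, std, rng):
    """Keep `value` with probability 1 - std, else pick another choice uniformly."""
    # A list (not a set) keeps the iteration order stable across runs.
    others = [c for c in choices if c != value]
    if not others or rng.random() < 1 - std:
        return value
    return rng.choice(others)


rng = random.Random(0)
choices = ["adam", "sgd", "rmsprop"]
n = 10_000
kept = sum(
    perturb_categorical("adam", choices, 0.1, rng) == "adam" for _ in range(n)
)
kept_fraction = kept / n  # should land close to 0.9 for std=0.1
```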
I guess functionality like this in ConfigSpace is different from the specific method get_one_exchange_neighborhood, and get_neighbors is not friendly enough to use. (Also, from a practical standpoint, I tried get_neighbors for the uniforms, but we are locked to a ConfigSpace version where get_neighbors got stuck in rejection sampling.)
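To illustrate the difference as described in this thread, here is a toy comparison on a plain dict "space". It does not use ConfigSpace at all; SPACE, one_exchange and perturb_all are made-up names, and one_exchange only mimics the "one HP at a time" behaviour read off above rather than reproducing the real get_one_exchange_neighborhood.

```python
import random

# Toy search space: name -> (lower, upper). Purely illustrative, not ConfigSpace.
SPACE = {"lr": (0.0, 1.0), "momentum": (0.0, 1.0), "dropout": (0.0, 1.0)}


def _jitter(value, lower, upper, std, rng):
    """Normal noise scaled to the range, clipped back into bounds."""
    sample = rng.gauss(value, std * (upper - lower))
    return min(max(sample, lower), upper)


def one_exchange(config, std, rng):
    """Yield neighbours that each change only one hyperparameter of `config`."""
    for name, (lower, upper) in SPACE.items():
        neighbour = dict(config)
        neighbour[name] = _jitter(config[name], lower, upper, std, rng)
        yield neighbour


def perturb_all(config, std, rng):
    """Return a single neighbour in which every hyperparameter is jittered."""
    return {
        name: _jitter(config[name], lower, upper, std, rng)
        for name, (lower, upper) in SPACE.items()
    }


rng = random.Random(0)
config = {"lr": 0.5, "momentum": 0.5, "dropout": 0.5}
exchanged = list(one_exchange(config, 0.05, rng))
perturbed = perturb_all(config, 0.05, rng)
```

The first gives several neighbours that each stay on an axis through the config; the second gives one point inside the "sphere" around it, which is what perturb is after.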
I think the best course of action is a more usable form of get_neighbors on each HP, with the current get_neighbors made private, since it sits in an optimized hot loop and works with the scaled np.ndarray values, which require transformations that are non-obvious and prone to silent errors.