FLAML
Integer log uniform sampling distribution issue
Hi! I was exploring the use of this for some systems optimization and noticed that the log uniform sampling doesn't appear to be log uniform. It seems to be skewed towards the leading part of the distribution. Here's a simple test case I cooked up:
https://github.com/bpkroth/FLAML/blob/test-log-uniform-sampling/test/test_log_uniform.py https://github.com/bpkroth/FLAML/blob/a620aa0dc70ff59144d78c735be8adde4b8f997f/test/test_log_uniform.py
I just ran a standalone test using your code and it doesn't fail:
# %%
import flaml
base = 10
rand_float = flaml.tune.loguniform(1, 20, base=base)
samples = rand_float.sample(size=10000)
# %%
import numpy as np
logs = np.log(samples) / np.log(base)
# %%
import numpy.typing as npt
import scipy.stats
def assert_is_uniform(arr: npt.NDArray) -> None:
    """Implements a few tests for uniformity."""
    _values, counts = np.unique(arr, return_counts=True)
    kurtosis = scipy.stats.kurtosis(arr)
    _chi_sq, p_value = scipy.stats.chisquare(counts)
    frequencies = counts / len(arr)
    assert np.isclose(frequencies.sum(), 1)
    _f_chi_sq, f_p_value = scipy.stats.chisquare(frequencies)
    assert np.isclose(kurtosis, -1.2, atol=0.1)
    assert p_value > 0.5
    assert f_p_value > 0.5
# %%
assert_is_uniform(logs)
I'll try to run your test code next.
I ran the test. The failed tests are
FAILED test/test_log_uniform.py::test_flaml_uniform_int - assert 0.33734160716544936 > 0.5
FAILED test/test_log_uniform.py::test_flaml_log_uniform_int - assert 0.0 > 0.5
The failure is not about log. It's about int. Do your tests work for int?
Hmm, in my original repo it was failing only for log, but yeah when I run it in your repo (I just stashed that file in there for an easy place to share it), it fails for log int every time, and sometimes for int. I wonder if there's something about the seed.
For completeness, here's the test setup:
- Clone your repo.
- Open the devcontainer in VS Code.
- Run the following (the -c /dev/null option instructs pytest to ignore the local pytest config files, so it's more standalone):
pytest -c /dev/null tests/test_log_uniform.py
With that I get this output:
======================================================================================================================== short test summary info =========================================================================================================================
FAILED ../../../dev/::test_flaml_uniform_int - assert 0.11160117348822314 > 0.5
FAILED ../../../dev/::test_flaml_log_uniform_int - assert False
================================================================================================================ 2 failed, 4 passed, 21 warnings in 3.34s ================================================================================================================
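One possible explanation for the intermittent plain int failure, independent of FLAML: when the data really are uniform, the chi-square p-value is itself roughly uniform on [0, 1], so an assertion like p_value > 0.5 is expected to fail about half the time. The sketch below checks this with NumPy's own integer sampler rather than FLAML's, and the trial count and range are arbitrary choices for illustration.

```python
import numpy as np
import scipy.stats

rng = np.random.default_rng(0)
lb, ub = 1, 20          # integers are drawn from [lb, ub), ub excluded
n_trials = 2000

p_values = np.empty(n_trials)
for i in range(n_trials):
    # Genuinely uniform integers, so the null hypothesis is true by construction.
    draws = rng.integers(lb, ub, size=10_000)
    counts = np.bincount(draws, minlength=ub)[lb:ub]
    p_values[i] = scipy.stats.chisquare(counts).pvalue

# Fraction of trials that would pass each threshold.
frac_above_half = np.mean(p_values > 0.5)
frac_above_005 = np.mean(p_values > 0.05)
print(f"pass rate with p > 0.5:  {frac_above_half:.2f}")
print(f"pass rate with p > 0.05: {frac_above_005:.2f}")
```

If that is what is happening here, a conventional significance threshold such as p_value > 0.05, combined with a fixed seed, would make the uniform int test far less flaky.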
Double checked. The float log uniform was an errant report on my part - I was applying a ceil mistakenly in the stack before doing the test. Removed that and it passes as expected.
The log uniform int issue remains though.
A few facts in case they help:
- The distribution over integers can't be treated as a uniform distribution over a continuous space: the sampling probability concentrates on a few integers.
- randint(lb, ub) samples integers in [lb, ub); ub is not included.
Do these explain why the test didn't pass?
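To illustrate the first point, here is a minimal sketch in plain NumPy. It assumes the integer sampler floors a log-uniform float on [lb, ub); that is an assumption made for illustration, not a statement about FLAML's actual implementation, and the helper name log_uniform_int_probs is hypothetical. Under that assumption, integer k is drawn with probability (log(k + 1) - log(k)) / (log(ub) - log(lb)), which is far from equal across k, so a chi-square test against equal counts is expected to reject.

```python
import numpy as np

def log_uniform_int_probs(lb: int, ub: int) -> np.ndarray:
    """P(k) for k in [lb, ub) when a log-uniform float on [lb, ub) is floored."""
    edges = np.log(np.arange(lb, ub + 1))
    return np.diff(edges) / (edges[-1] - edges[0])

rng = np.random.default_rng(0)
lb, ub = 1, 20

# Assumed sampling scheme: uniform in log space, exponentiate, then floor.
floats = np.exp(rng.uniform(np.log(lb), np.log(ub), size=100_000))
samples = np.floor(floats).astype(int)

observed = np.bincount(samples, minlength=ub)[lb:ub] / samples.size
expected = log_uniform_int_probs(lb, ub)

# Small integers carry most of the mass; the last few carry very little.
for k, (obs, exp) in enumerate(zip(observed, expected), start=lb):
    print(f"{k:2d}  observed={obs:.4f}  expected={exp:.4f}")
```

A uniformity test for the integer case would then compare the observed counts against these expected probabilities (for example via the f_exp argument of scipy.stats.chisquare) rather than against equal counts.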