Handling boolean parameter

Open KEggensperger opened this issue 6 years ago • 11 comments

Not sure where to put this, but if you add the following lines to get_configspace() in example_mlp_on_digits.py, the notebook crashes:

boolean = CSH.CategoricalHyperparameter('boolean', [True, False])
config_space.add_hyperparameter(boolean)

I noticed this when trying to run CAVE on my own data, so I guess having such a parameter is a common use case. Is there anything I can do about this, or is it already fixed in a newer version that is not yet available via pip?

KEggensperger avatar Jan 08 '19 11:01 KEggensperger

I will investigate this. Thank you for reporting.

shukon avatar Jan 09 '19 15:01 shukon

Marius told me that this should work when writing JSON files (and I saw some commits moving from pcs to json in other repositories). Is there currently a combination of CAVE/BOHB/ConfigSpace versions where the problem is fixed?

KEggensperger avatar Jan 09 '19 15:01 KEggensperger

Hey, I faced a similar problem. I think it is caused by how the pcs format stores and restores values: it does not preserve type information.

Here is a little example:

import ConfigSpace as CS
from ConfigSpace.read_and_write import json, pcs_new
from pathlib import Path

cs = CS.ConfigurationSpace()
cs.add_hyperparameters([
    CS.CategoricalHyperparameter('categorical_num', [1, 2, 3]),
    CS.CategoricalHyperparameter('boolean', [True, False])
])

# save this configspace as pcs_new and json.
output_dir = Path.cwd()
with open(output_dir / 'config.pcs', 'w') as f:
    f.write(pcs_new.write(cs))
with open(output_dir / 'config.json', 'w') as f:
    f.write(json.write(cs))
with open(output_dir / 'config.pcs', 'r') as f:
    cs_pcs = pcs_new.read(f)
with open(output_dir / 'config.json', 'r') as f:
    cs_json = json.read(f.read())

cfg_pcs = cs_pcs.sample_configuration()
cfg_json = cs_json.sample_configuration()

print('PCS-New:\n {} Type Boolean {},\n Type categorical_num {}'.format(
    cfg_pcs, type(cfg_pcs['boolean']), type(cfg_pcs['categorical_num'])))
print('\nJSON:\n {} Type Boolean {},\n Type categorical_num {}'.format(
    cfg_json, type(cfg_json['boolean']), type(cfg_json['categorical_num']))) 

OUTPUT:

PCS-New:
 Configuration:
  boolean, Value: 'False'
  categorical_num, Value: '1'
 Type Boolean <class 'str'>,
 Type categorical_num <class 'str'>

JSON:
 Configuration:
  boolean, Value: False
  categorical_num, Value: 1
 Type Boolean <class 'bool'>,
 Type categorical_num <class 'int'>

As you can see, the pcs format restores everything as a string. I guess that causes the error when the configuration space is read in or reused in CAVE.
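To illustrate why a string-typed value gets rejected, here is a minimal sketch in plain Python. It makes no ConfigSpace calls, and `coerce_to_choice` is a hypothetical helper, not part of any library. It shows that the string 'True' is not a member of the choices (True, False), and one possible way to map a restored string back to the typed choice:

```python
def coerce_to_choice(value, choices):
    """Map a value that was restored as a string back to the typed choice.

    Hypothetical helper for illustration only; not part of ConfigSpace.
    """
    # First pass: exact match including the type (True == 1 in Python,
    # so comparing with == alone would confuse bools and ints).
    for choice in choices:
        if type(value) is type(choice) and value == choice:
            return choice
    # Second pass: the value is a string that spells one of the choices.
    for choice in choices:
        if isinstance(value, str) and value == str(choice):
            return choice
    raise ValueError('{!r} matches none of {!r}'.format(value, choices))


choices = (True, False)
print('True' in choices)                       # the string is not a legal choice
print(coerce_to_choice('True', choices))       # mapped back to the bool
print(type(coerce_to_choice('1', (1, 2, 3))))  # mapped back to an int
```

Whether such a coercion is the right fix depends on the application; writing the configspace as JSON avoids the problem in the first place.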

In which line does the notebook crash? The Fmin interface (development branch) uses the JSON format and thus should work.

PhMueller avatar Jan 10 '19 20:01 PhMueller

Hmm, I still can't run it. If I write the configspace as .json (using cave==1.1.4.dev0), I get the following error in the notebook for the bash command under "2.1) Creating a HTML-report with CAVE":

ERROR:cave.cavefacade.CAVE:Trying to set illegal value 'True' forhyperparameter boolean, Type: Categorical, Choices: {True, False}, Default: True.
Traceback (most recent call last):
  File "/home/eggenspk/anaconda3/envs/dl4ac_36/lib/python3.6/site-packages/cave-1.1.4.dev0-py3.6.egg/cave/cavefacade.py", line 248, in __init__
    validation_format=validation_format))
  File "/home/eggenspk/anaconda3/envs/dl4ac_36/lib/python3.6/site-packages/cave-1.1.4.dev0-py3.6.egg/cave/reader/configurator_run.py", line 70, in __init__
    self.original_runhistory = self.reader.get_runhistory(self.scen.cs)
  File "/home/eggenspk/anaconda3/envs/dl4ac_36/lib/python3.6/site-packages/cave-1.1.4.dev0-py3.6.egg/cave/reader/smac3_reader.py", line 55, in get_runhistory
    rh.load_json(rh_fn, cs)
  File "/home/eggenspk/anaconda3/envs/dl4ac_36/lib/python3.6/site-packages/smac/runhistory/runhistory.py", line 351, in load_json
    for id_, values in all_data["configs"].items()}
  File "/home/eggenspk/anaconda3/envs/dl4ac_36/lib/python3.6/site-packages/smac/runhistory/runhistory.py", line 351, in <dictcomp>
    for id_, values in all_data["configs"].items()}
  File "ConfigSpace/configuration_space.py", line 1010, in ConfigSpace.configuration_space.Configuration.__init__
ValueError: Trying to set illegal value 'True' forhyperparameter boolean, Type: Categorical, Choices: {True, False}, Default: True.
ERROR:cave.cavefacade.CAVE:Trying to set illegal value 'True' forhyperparameter boolean, Type: Categorical, Choices: {True, False}, Default: True.
Traceback (most recent call last):
  File "/home/eggenspk/anaconda3/envs/dl4ac_36/lib/python3.6/site-packages/cave-1.1.4.dev0-py3.6.egg/cave/cavefacade.py", line 248, in __init__
    validation_format=validation_format))
  File "/home/eggenspk/anaconda3/envs/dl4ac_36/lib/python3.6/site-packages/cave-1.1.4.dev0-py3.6.egg/cave/reader/configurator_run.py", line 70, in __init__
    self.original_runhistory = self.reader.get_runhistory(self.scen.cs)
  File "/home/eggenspk/anaconda3/envs/dl4ac_36/lib/python3.6/site-packages/cave-1.1.4.dev0-py3.6.egg/cave/reader/smac3_reader.py", line 55, in get_runhistory
    rh.load_json(rh_fn, cs)
  File "/home/eggenspk/anaconda3/envs/dl4ac_36/lib/python3.6/site-packages/smac/runhistory/runhistory.py", line 351, in load_json
    for id_, values in all_data["configs"].items()}
  File "/home/eggenspk/anaconda3/envs/dl4ac_36/lib/python3.6/site-packages/smac/runhistory/runhistory.py", line 351, in <dictcomp>
    for id_, values in all_data["configs"].items()}
  File "ConfigSpace/configuration_space.py", line 1010, in ConfigSpace.configuration_space.Configuration.__init__
ValueError: Trying to set illegal value 'True' forhyperparameter boolean, Type: Categorical, Choices: {True, False}, Default: True.
ERROR:cave.cavefacade.CAVE:Trying to set illegal value 'True' forhyperparameter boolean, Type: Categorical, Choices: {True, False}, Default: True.
Traceback (most recent call last):
  File "/home/eggenspk/anaconda3/envs/dl4ac_36/lib/python3.6/site-packages/cave-1.1.4.dev0-py3.6.egg/cave/cavefacade.py", line 248, in __init__
    validation_format=validation_format))
  File "/home/eggenspk/anaconda3/envs/dl4ac_36/lib/python3.6/site-packages/cave-1.1.4.dev0-py3.6.egg/cave/reader/configurator_run.py", line 70, in __init__
    self.original_runhistory = self.reader.get_runhistory(self.scen.cs)
  File "/home/eggenspk/anaconda3/envs/dl4ac_36/lib/python3.6/site-packages/cave-1.1.4.dev0-py3.6.egg/cave/reader/smac3_reader.py", line 55, in get_runhistory
    rh.load_json(rh_fn, cs)
  File "/home/eggenspk/anaconda3/envs/dl4ac_36/lib/python3.6/site-packages/smac/runhistory/runhistory.py", line 351, in load_json
    for id_, values in all_data["configs"].items()}
  File "/home/eggenspk/anaconda3/envs/dl4ac_36/lib/python3.6/site-packages/smac/runhistory/runhistory.py", line 351, in <dictcomp>
    for id_, values in all_data["configs"].items()}
  File "ConfigSpace/configuration_space.py", line 1010, in ConfigSpace.configuration_space.Configuration.__init__
ValueError: Trying to set illegal value 'True' forhyperparameter boolean, Type: Categorical, Choices: {True, False}, Default: True.
ERROR:cave.cavefacade.CAVE:Trying to set illegal value 'True' forhyperparameter boolean, Type: Categorical, Choices: {True, False}, Default: True.
Traceback (most recent call last):
  File "/home/eggenspk/anaconda3/envs/dl4ac_36/lib/python3.6/site-packages/cave-1.1.4.dev0-py3.6.egg/cave/cavefacade.py", line 248, in __init__
    validation_format=validation_format))
  File "/home/eggenspk/anaconda3/envs/dl4ac_36/lib/python3.6/site-packages/cave-1.1.4.dev0-py3.6.egg/cave/reader/configurator_run.py", line 70, in __init__
    self.original_runhistory = self.reader.get_runhistory(self.scen.cs)
  File "/home/eggenspk/anaconda3/envs/dl4ac_36/lib/python3.6/site-packages/cave-1.1.4.dev0-py3.6.egg/cave/reader/smac3_reader.py", line 55, in get_runhistory
    rh.load_json(rh_fn, cs)
  File "/home/eggenspk/anaconda3/envs/dl4ac_36/lib/python3.6/site-packages/smac/runhistory/runhistory.py", line 351, in load_json
    for id_, values in all_data["configs"].items()}
  File "/home/eggenspk/anaconda3/envs/dl4ac_36/lib/python3.6/site-packages/smac/runhistory/runhistory.py", line 351, in <dictcomp>
    for id_, values in all_data["configs"].items()}
  File "ConfigSpace/configuration_space.py", line 1010, in ConfigSpace.configuration_space.Configuration.__init__
ValueError: Trying to set illegal value 'True' forhyperparameter boolean, Type: Categorical, Choices: {True, False}, Default: True.
ERROR:cave.cavefacade.CAVE:Trying to set illegal value 'False' forhyperparameter boolean, Type: Categorical, Choices: {True, False}, Default: True.
Traceback (most recent call last):
  File "/home/eggenspk/anaconda3/envs/dl4ac_36/lib/python3.6/site-packages/cave-1.1.4.dev0-py3.6.egg/cave/cavefacade.py", line 248, in __init__
    validation_format=validation_format))
  File "/home/eggenspk/anaconda3/envs/dl4ac_36/lib/python3.6/site-packages/cave-1.1.4.dev0-py3.6.egg/cave/reader/configurator_run.py", line 70, in __init__
    self.original_runhistory = self.reader.get_runhistory(self.scen.cs)
  File "/home/eggenspk/anaconda3/envs/dl4ac_36/lib/python3.6/site-packages/cave-1.1.4.dev0-py3.6.egg/cave/reader/smac3_reader.py", line 55, in get_runhistory
    rh.load_json(rh_fn, cs)
  File "/home/eggenspk/anaconda3/envs/dl4ac_36/lib/python3.6/site-packages/smac/runhistory/runhistory.py", line 351, in load_json
    for id_, values in all_data["configs"].items()}
  File "/home/eggenspk/anaconda3/envs/dl4ac_36/lib/python3.6/site-packages/smac/runhistory/runhistory.py", line 351, in <dictcomp>
    for id_, values in all_data["configs"].items()}
  File "ConfigSpace/configuration_space.py", line 1010, in ConfigSpace.configuration_space.Configuration.__init__
ValueError: Trying to set illegal value 'False' forhyperparameter boolean, Type: Categorical, Choices: {True, False}, Default: True.
Traceback (most recent call last):
  File "/home/eggenspk/anaconda3/envs/dl4ac_36/bin/cave", line 11, in <module>
    load_entry_point('cave==1.1.4.dev0', 'console_scripts', 'cave')()
  File "/home/eggenspk/anaconda3/envs/dl4ac_36/lib/python3.6/site-packages/cave-1.1.4.dev0-py3.6.egg/cave/cave_cli.py", line 312, in entry_point
    cave.main_cli()
  File "/home/eggenspk/anaconda3/envs/dl4ac_36/lib/python3.6/site-packages/cave-1.1.4.dev0-py3.6.egg/cave/cave_cli.py", line 292, in main_cli
    verbose_level=verbose_level)
  File "/home/eggenspk/anaconda3/envs/dl4ac_36/lib/python3.6/site-packages/cave-1.1.4.dev0-py3.6.egg/cave/cavefacade.py", line 255, in __init__
    raise ValueError("None of the specified folders could be loaded.")
ValueError: None of the specified folders could be loaded.

I modified the following:

Adding to example_mlp_on_digits.py:

boolean = CSH.CategoricalHyperparameter('boolean', [True, False])
config_space.add_hyperparameter(boolean)

and changing the first cell in the Python notebook to:

import os

from ConfigSpace.read_and_write import json
import hpbandster.core.nameserver as hpns
import hpbandster.core.result as hpres
from hpbandster.optimizers import BOHB as BOHB

# Create and save a configuration space
config_space = get_configspace()
out_dir = 'example_mlp_on_digits'
os.makedirs(out_dir, exist_ok=True)
#with open(os.path.join(out_dir, 'configspace.pcs'), 'w') as fh:
#    fh.write(pcs_new.write(config_space))
    
with open(os.path.join(out_dir, 'configspace.json'), 'w') as fh:
    fh.write(json.write(config_space))

@shukon, @PhMueller: Could you please quickly check whether you get the same? To me it looks like the problem is reading the runhistory file rather than reading the configspace.
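One way to check which file loses the type is to look at the Python types of the stored values. The sketch below assumes only what the traceback shows, namely that the runhistory JSON has a top-level "configs" mapping from ids to value dictionaries. The sample data and the helper name `find_stringified_values` are made up for illustration:

```python
import json


def find_stringified_values(runhistory_text, name):
    """Return (config_id, value) pairs where `name` was stored as a string."""
    all_data = json.loads(runhistory_text)
    return [(id_, values[name])
            for id_, values in all_data["configs"].items()
            if isinstance(values.get(name), str)]


# Made-up sample: config "1" stores a string, config "2" a real JSON boolean.
sample = '{"configs": {"1": {"boolean": "True"}, "2": {"boolean": false}}}'
print(find_stringified_values(sample, "boolean"))
```

If this reports entries for a real runhistory file, the values were already written as strings, and the configspace file is not the culprit.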

KEggensperger avatar Jan 14 '19 14:01 KEggensperger

Hey @KEggensperger, I'm sorry, I lost track of this issue and missed the pings. Can you please check whether it's still an issue with the current dev branch of CAVE? (This should be fixed in smac 0.10.0, and CAVE now enforces that version.) Sorry for the delay!

shukon avatar Jan 21 '19 19:01 shukon

@shukon Thanks a lot, it works for most scenarios. For some it crashes with ValueError: Out of range float values are not JSON compliant. My best guess is that this is caused by NaN losses: do you know whether CAVE can handle NaN as a loss value, e.g. if a run crashed?
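For reference, this is the message Python's standard `json` module raises when it is asked to serialize NaN or infinity with `allow_nan=False`. A minimal reproduction, independent of CAVE (whether CAVE hits exactly this code path is an assumption):

```python
import json

loss = float('nan')  # e.g. the loss reported by a crashed run

# By default the encoder emits the non-standard token NaN.
print(json.dumps({'loss': loss}))

# With allow_nan=False the encoder refuses non-finite floats.
try:
    json.dumps({'loss': loss}, allow_nan=False)
except ValueError as err:
    message = str(err)
    print(message)
```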

KEggensperger avatar Jan 22 '19 09:01 KEggensperger

@KEggensperger Actually, I think CAVE currently ignores runs that return None values. NaNs might cause problems. Can you either

  • provide me with data producing the errors (MWE) or
  • check if it still happens with this branch (pip install git+https://github.com/automl/CAVE.git@FIX-nan-losses)

shukon avatar Jan 22 '19 10:01 shukon

Also, do you think it's OK to skip None and NaN values, or should they be processed somehow?

shukon avatar Jan 22 '19 10:01 shukon

Yes, that fix works perfectly fine.

I am not sure how to handle NaN, None and inf correctly. Maybe they should be replaced with a bad value (i.e. a high loss). Ideally, users can change this via an argument, and the default is to ignore them. In any case, it would be great if the overview showed how many runs were ignored. You could, e.g., drop one of

# Runs per Config (min)
# Runs per Config (mean)
# Runs per Config (max)

and replace with

# Dropped runs with non-finite cost
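The default suggested here (ignore such runs, but report how many were ignored) could look roughly like the following. This is a plain-Python sketch; `split_finite_costs` and the sample costs are hypothetical and not taken from CAVE:

```python
import math


def split_finite_costs(costs):
    """Return (finite_costs, n_dropped); None, NaN and +/-inf are dropped."""
    finite = [c for c in costs if c is not None and math.isfinite(c)]
    return finite, len(costs) - len(finite)


# Made-up costs: two valid runs and three runs with unusable costs.
costs = [0.12, float('nan'), None, 0.30, float('inf')]
finite, n_dropped = split_finite_costs(costs)
print('# Dropped runs with non-finite cost: {}'.format(n_dropped))
```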

Feel free to close this issue or leave it open as a reminder.

KEggensperger avatar Jan 22 '19 11:01 KEggensperger

I suppose it's difficult to determine the bad value, or do you have an idea for that? If so, I could add those runs to the runhistory with status CRASHED. That would be smooth. But I don't know how I can guarantee that the loss is actually bad enough...

shukon avatar Jan 22 '19 12:01 shukon

You can't automatically guess the bad value. A somewhat reasonable heuristic could be 2 x the worst seen/possible value, e.g. for (1-accuracy) this would be 2. But that depends on the application and what causes NaN values and therefore needs to be defined by the user.
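As a sketch of that heuristic with a user override: `impute_bad_costs` and its `bad_value` argument are hypothetical names, and the code assumes non-negative costs that are minimized (for negative costs, doubling the worst value would make it look better, not worse).

```python
import math


def impute_bad_costs(costs, bad_value=None):
    """Replace None/NaN/inf with `bad_value`.

    If `bad_value` is not given, fall back to 2 x the worst finite cost
    seen. Assumes non-negative costs that are minimized.
    """
    finite = [c for c in costs if c is not None and math.isfinite(c)]
    if bad_value is None:
        if not finite:
            raise ValueError('no finite cost to derive a bad value from')
        bad_value = 2 * max(finite)
    return [c if c is not None and math.isfinite(c) else bad_value
            for c in costs]


costs = [0.25, float('nan'), 0.5, None]
print(impute_bad_costs(costs))                 # 2 x the worst seen cost
print(impute_bad_costs(costs, bad_value=2.0))  # user-defined bad value
```

The second call corresponds to the (1-accuracy) case, where 2 x the worst possible value is 2.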

In my opinion all solutions are fine as long as it is documented somewhere and shown either as a log message or in the report.

KEggensperger avatar Jan 22 '19 13:01 KEggensperger