ParameterImportance
Broken fANOVA
Running on the spear-qcp example (from SMAC3)
$ python ~/git/ParameterImportance/scripts/evaluate.py --scenario_file smac3-output_2017-05-12_12\:53\:31_\(750603\)_run1/scenario.txt --history smac3-output_2017-05-12_12\:53\:31_\(750603\)_run1/runhistory.json --trajectory smac3-output_2017-05-12_12\:53\:31_\(750603\)_run1/traj_aclib2.json --num_params 10 --modus all
[...]
INFO:fANOVA:PREPROCESSING PREPROCESSING PREPROCESSING PREPROCESSING PREPROCESSING PREPROCESSING
INFO:fANOVA:Finished Preprocessing
Traceback (most recent call last):
  File "/home/lindauer/git/ParameterImportance/scripts/evaluate.py", line 41, in <module>
    result = importance.evaluate_scenario(args.modus)
  File "/home/lindauer/git/ParameterImportance/pimp/importance/importance.py", line 240, in evaluate_scenario
    self.evaluator = method
  File "/home/lindauer/git/ParameterImportance/pimp/importance/importance.py", line 160, in evaluator
    to_evaluate=self._parameters_to_evaluate)
  File "/home/lindauer/git/ParameterImportance/pimp/evaluator/fanova.py", line 31, in __init__
    self.evaluator = fanova_pyrfr(X=self.X, Y=self.y.flatten(), config_space=cs, config_on_hypercube=True)
TypeError: __init__() got an unexpected keyword argument 'config_on_hypercube'
@mlindauer: Did you install fANOVA as listed in the requirements file, i.e. git+http://github.com/automl/fanova@952c9bd46b47cde87036c00f974629c9e5819565? (I'm still working on getting a tag for fANOVA so we don't have to reference commit hashes.) If so, this shouldn't happen. If you installed it manually, you've installed the wrong branch; the pyrfr_reimplementation branch is the one we need here.
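For reference, the pinned entry from the requirements file looks like the fragment below, and pip can consume it directly (the commit hash is the one quoted above; no tagged release exists yet):

```
git+http://github.com/automl/fanova@952c9bd46b47cde87036c00f974629c9e5819565
```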
Yes, indeed, I used the wrong branch. After reinstalling fANOVA, I'm getting a new error now:
INFO:Importance:Running evaluation method fANOVA
Traceback (most recent call last):
  File "/home/lindauer/git/ParameterImportance/scripts/evaluate.py", line 41, in <module>
    result = importance.evaluate_scenario(args.modus)
  File "/home/lindauer/git/ParameterImportance/pimp/importance/importance.py", line 247, in evaluate_scenario
    return {evaluation_method: self.evaluator.run()}
  File "/home/lindauer/git/ParameterImportance/pimp/evaluator/fanova.py", line 76, in run
    idx, param.name, self.evaluator.quantify_importance([idx])[(idx, )]['total importance']))
  File "/home/lindauer/anaconda3/lib/python3.6/site-packages/fanova/fanova.py", line 280, in quantify_importance
    [self.V_U_total[sub_dims][t] / self.trees_total_variance[t] for t in range(self.n_trees)])
  File "/home/lindauer/anaconda3/lib/python3.6/site-packages/fanova/fanova.py", line 280, in <listcomp>
    [self.V_U_total[sub_dims][t] / self.trees_total_variance[t] for t in range(self.n_trees)])
ZeroDivisionError: float division by zero
This means that one of the trees doesn't have any variance, which can happen in the following scenarios:
- you have cut off all the data
- all the data points have the same performance
- you only have 1 data point
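A defensive fix on the fANOVA side could skip trees without variance instead of dividing by zero. A minimal sketch, with made-up names standing in for fANOVA's internals (`v_u_total`, `trees_total_variance`), not the actual implementation:

```python
def total_importance(v_u_total, trees_total_variance):
    """Fraction of explained variance per tree, averaged over the forest.

    Trees whose total variance is zero (e.g. because all of their training
    points share the same performance value) are skipped rather than
    triggering the ZeroDivisionError seen in fanova.py line 280.
    """
    fractions = [v / t for v, t in zip(v_u_total, trees_total_variance) if t > 0]
    if not fractions:
        # Every tree is degenerate -- the input data itself is the problem.
        raise ValueError("no tree has non-zero variance; check the input data")
    return sum(fractions) / len(fractions)
```

Whether skipping degenerate trees or raising a descriptive error is the right behavior is a design decision for the fANOVA maintainers; either is clearer than the bare division-by-zero crash.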
Nonetheless, the error shouldn't happen, and this issue actually belongs in the fANOVA repository. Could you take a snapshot of the data fed into fANOVA, so that I can reproduce it on my side and fix it? Thanks!
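To make such a reproduction easy, the inputs PIMP hands to `fanova_pyrfr` (the design matrix `X`, the response vector `Y` and the configuration space) could be dumped to disk right before the call. A hypothetical helper, assuming the objects are picklable; the function names are illustrative, not part of either library:

```python
import pickle

def snapshot_fanova_input(path, X, Y, config_space):
    """Dump the exact inputs handed to fanova_pyrfr so a crash can be
    replayed outside of PIMP."""
    with open(path, "wb") as fh:
        pickle.dump({"X": X, "Y": Y, "config_space": config_space}, fh)

def load_fanova_input(path):
    """Load a previously taken snapshot back into a dict."""
    with open(path, "rb") as fh:
        return pickle.load(fh)
```

Attaching such a snapshot file to this issue would let the crash be replayed with nothing but the fANOVA package installed.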