InputChoice raises issues with TPE search strategy in NAS

Open sw33zy opened this issue 2 years ago • 1 comment

Describe the issue: In version 3.0rc1, TPE does not seem to be compatible with the InputChoice primitive. In addition, I found that the TPE tuner defaults to minimize (I assume because all HPO tuners minimize), but this is inconsistent with https://github.com/microsoft/nni/issues/5626#issuecomment-1615350440 ; perhaps it should be initialized with optimize_mode='maximize'.

Relevant code:

    class Block(nn.Module):
        def __init__(...) -> None:
            super().__init__()
            (...)
            self.skip_connection = InputChoice(n_candidates=2, n_chosen=1,
                                               label='block_skip_connection' + str(index))

        def forward(self, x: Tensor) -> Tensor:
            x_input = x
            (...)
            x = self.skip_connection([x, x + x_input])
            return x

Raises:

[2023-08-03 17:59:50] Starting web server...
[2023-08-03 17:59:51] Setting up...
[2023-08-03 17:59:52] Web portal URLs: http://169.254.250.24:9120 http://10.0.0.1:9120 http://192.168.56.1:9120 http://169.254.81.199:9120 http://169.254.164.10:9120 http://169.254.20.238:9120 http://169.254.178.165:9120 http://192.168.1.68:9120 http://169.254.160.23:9120 http://10.0.4.52:9120 http://169.254.168.66:9120 http://127.0.0.1:9120
[2023-08-03 17:59:52] WARNING: Cannot convert CategoricalMultiple([0, 1], n_chosen=1, label='dnn/block_skip_connection0') to legacy format. It will not show on WebUI.
[2023-08-03 17:59:52] WARNING: Cannot convert CategoricalMultiple([0, 1], n_chosen=1, label='dnn/block_skip_connection1') to legacy format. It will not show on WebUI.
[2023-08-03 17:59:52] WARNING: Cannot convert CategoricalMultiple([0, 1], n_chosen=1, label='dnn/block_skip_connection2') to legacy format. It will not show on WebUI.
[2023-08-03 17:59:52] WARNING: Cannot convert CategoricalMultiple([0, 1], n_chosen=1, label='dnn/block_skip_connection3') to legacy format. It will not show on WebUI.
[2023-08-03 17:59:52] WARNING: Cannot convert CategoricalMultiple([0, 1], n_chosen=1, label='dnn/block_skip_connection4') to legacy format. It will not show on WebUI.
[2023-08-03 17:59:52] Successfully update searchSpace.
[2023-08-03 17:59:52] Experiment initialized successfully. Starting exploration strategy...
[2023-08-03 17:59:52] ERROR: Strategy failed to execute.
[2023-08-03 17:59:52] Stopping experiment, please wait...
Traceback (most recent call last):
  File "C:\Users\Leonardo\Documents\Universidade Leo\5º ano\tese\Omnia\omnia\omnia\examples\single_drug.py", line 97, in <module>
    nni_predictor.fit(refit_best=True)
  File "C:\Users\Leonardo\Documents\Universidade Leo\5º ano\tese\Omnia\omnia\omnia\src\omnia\generics\nas\nni_predictor.py", line 309, in fit
    experiment.start_experiment()
  File "C:\Users\Leonardo\Documents\Universidade Leo\5º ano\tese\Omnia\omnia\omnia\src\omnia\generics\nas\experiment.py", line 107, in start_experiment
    experiment.run(port=self.port)
  File "C:\Users\Leonardo\AppData\Local\pypoetry\Cache\virtualenvs\omnia-local-1fEoJYjW-py3.9\lib\site-packages\nni\experiment\experiment.py", line 236, in run
    return self._run_impl(port, wait_completion, debug)
  File "C:\Users\Leonardo\AppData\Local\pypoetry\Cache\virtualenvs\omnia-local-1fEoJYjW-py3.9\lib\site-packages\nni\experiment\experiment.py", line 205, in _run_impl
    self.start(port, debug)
  File "C:\Users\Leonardo\AppData\Local\pypoetry\Cache\virtualenvs\omnia-local-1fEoJYjW-py3.9\lib\site-packages\nni\nas\experiment\experiment.py", line 270, in start
    self._start_engine_and_strategy()
  File "C:\Users\Leonardo\AppData\Local\pypoetry\Cache\virtualenvs\omnia-local-1fEoJYjW-py3.9\lib\site-packages\nni\nas\experiment\experiment.py", line 230, in _start_engine_and_strategy
    self.strategy.run()
  File "C:\Users\Leonardo\AppData\Local\pypoetry\Cache\virtualenvs\omnia-local-1fEoJYjW-py3.9\lib\site-packages\nni\nas\strategy\base.py", line 170, in run
    self._run()
  File "C:\Users\Leonardo\AppData\Local\pypoetry\Cache\virtualenvs\omnia-local-1fEoJYjW-py3.9\lib\site-packages\nni\nas\strategy\hpo.py", line 69, in _run
    tuner_search_space = {label: mutable.as_legacy_dict() for label, mutable in self.model_space.simplify().items()}
  File "C:\Users\Leonardo\AppData\Local\pypoetry\Cache\virtualenvs\omnia-local-1fEoJYjW-py3.9\lib\site-packages\nni\nas\strategy\hpo.py", line 69, in <dictcomp>
    tuner_search_space = {label: mutable.as_legacy_dict() for label, mutable in self.model_space.simplify().items()}
  File "C:\Users\Leonardo\AppData\Local\pypoetry\Cache\virtualenvs\omnia-local-1fEoJYjW-py3.9\lib\site-packages\nni\mutable\mutable.py", line 356, in as_legacy_dict
    raise NotImplementedError(f'as_legacy_dict is not implemented for this type of mutable: {type(self)}.')
NotImplementedError: as_legacy_dict is not implemented for this type of mutable: <class 'nni.mutable.mutable.CategoricalMultiple'>.
[2023-08-03 17:59:52] Experiment stopped

Process finished with exit code 1

Environment:

NNI version: 3.0rc1
Training service (local|remote|pai|aml|etc): local
Client OS: Windows 10
Python version: 3.9.13
PyTorch/TensorFlow version: 1.13.0
Is conda/virtualenv/venv used?: pypoetry
Is running in Docker?: No

Aug 03 '23 17:08 sw33zy

CategoricalMultiple is not supported by the TPE strategy.

But I think we can do better by handling the special case of CategoricalMultiple with n_chosen=1 in as_legacy_dict(). as_legacy_dict() was meant to be temporary when it was implemented, but now that it is becoming long-term, it might be worth the effort to implement that special case.
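
A rough sketch of that special case, not the actual NNI implementation: `CategoricalMultipleSketch` and `wrap_legacy_sample` are hypothetical stand-ins written for illustration, and the only NNI convention assumed is the legacy search-space entry of the form `{'_type': 'choice', '_value': [...]}`. The assumption that a single sampled value has to be wrapped back into a list before being applied to the mutable is also unverified.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class CategoricalMultipleSketch:
    """Hypothetical stand-in for nni.mutable.CategoricalMultiple (not the real class)."""
    values: List[Any]
    n_chosen: int
    label: str

    def as_legacy_dict(self) -> Dict[str, Any]:
        # With exactly one chosen item, the space is equivalent to a plain
        # categorical, which the legacy format expresses as a "choice".
        if self.n_chosen == 1:
            return {'_type': 'choice', '_value': list(self.values)}
        # Any other n_chosen has no obvious legacy equivalent, so keep failing.
        raise NotImplementedError(
            f'as_legacy_dict is not implemented for this type of mutable: {type(self)}.'
        )


def wrap_legacy_sample(sample: Any) -> List[Any]:
    # Assumption: the multi-choice mutable expects a list of chosen values,
    # so the single value suggested by the tuner is wrapped before use.
    return [sample]


skip = CategoricalMultipleSketch([0, 1], n_chosen=1, label='dnn/block_skip_connection0')
legacy_space = {skip.label: skip.as_legacy_dict()}
```

With something along these lines, the dict comprehension in hpo.py from the traceback would get a usable entry for each `block_skip_connection` label instead of raising, while spaces with n_chosen other than 1 would still be rejected.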

Aug 17 '23 04:08 matluster