Retiarii - How To Get Final Model
I am following your tutorial at https://nni.readthedocs.io/en/latest/tutorials/hello_nas.html and it runs fine, but I don't understand how to get the final results unless I watch the experiment while it is running. I checked the logs at /content/nni/ec90fnpg/log/nnimanager.log, but they don't seem to contain any accuracy info or indicate which model is best. I just want to know what the accuracy score and the final model were, without having to watch the experiment run.
I can export the top model as you describe at https://nni.readthedocs.io/en/latest/tutorials/hello_nas.html#export-top-models, but it seems that only works while the experiment is active, and either way I am not sure how to use that exported info to train the model.
An alternative would be to use "nnictl view" to directly grab and parse the experiment's info while it is running, but a RetiariiExperiment can't be viewed with nnictl view (see https://github.com/microsoft/nni/issues/4743).
I simply want to run Retiarii to get an accuracy score and a trained model (or an export of the model parameters which I can then train). I would think that is what almost all users of a NAS program would want, but after hours of trying I can't figure out how to do that.
@impulsecorp thanks for raising this issue. The summary of the problem is very clear. We will support view and resume of Retiarii experiments in v2.9. For the upcoming v2.8, you can simply add one line, input('press any button to exit'), at the end of your Python script to block it from exiting. That keeps the WebUI alive, so you can still view and operate your experiment there.
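For concreteness, a minimal sketch of that workaround, assuming `exp` and `exp_config` are the RetiariiExperiment and its config object from the hello_nas tutorial:

```python
# Tail of the search script from the hello_nas tutorial; `exp` and
# `exp_config` are assumed to be the experiment and config built earlier.
exp.run(exp_config, 8081)

# v2.8 workaround: block the interpreter from exiting so the WebUI
# stays reachable after the search finishes.
input('press any button to exit')
```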
That would be a good feature to add, but it still seems like there should be an easier way to use Retiarii to get the accuracy score and a trained model, without having to use nnictl (and even with nnictl view I am not sure exactly how I would do that).
I simply want to do what a typical Retiarii user would do: train a model and get its accuracy score so I can later deploy it. But I am still unclear how to do that. Is there a better way to accomplish what I need?
First, for resume/view, you can use either the Python API (e.g., RetiariiExperiment.resume(...), RetiariiExperiment.view(...)) or nnictl; both will be supported in v2.9.
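For illustration only, a hedged sketch of what the Python-API path might look like once v2.9 ships (the experiment ID is the one from this thread; exact signatures may differ):

```python
from nni.retiarii.experiment.pytorch import RetiariiExperiment

# Planned for v2.9, not available in v2.8; signatures may change.
exp = RetiariiExperiment.view('q2owf8cx')      # inspect a stopped experiment
# exp = RetiariiExperiment.resume('q2owf8cx')  # or continue running it
```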
Second, after the best model is exported (it is a JSON object when you use the default execution engine), you can follow this API (https://nni.readthedocs.io/en/stable/reference/nas/others.html#retrain-architecture-evaluation) to instantiate the model specified by the exported JSON, as in the sketch below.
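Putting the two steps together, a minimal sketch based on the linked retrain docs, assuming the default execution engine and the tutorial's `ModelSpace` search-space class:

```python
import json
from nni.retiarii import fixed_arch

# Assumes `exp` is the finished RetiariiExperiment and `ModelSpace` is
# the @model_wrapper-decorated search space class from the tutorial.
best = exp.export_top_models(formatter='dict')[0]
with open('best_arch.json', 'w') as f:
    json.dump(best, f)

# fixed_arch freezes the search space to the exported architecture,
# yielding an ordinary PyTorch module you can train and deploy.
with fixed_arch('best_arch.json'):
    final_model = ModelSpace()
```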
What you said about searching, retraining, and deployment are common needs. NNI keeps improving this pipeline, and we welcome contributions, requirements, and suggestions from all of you.
Calling exp.view('q2owf8cx') on the RetiariiExperiment instance gives me this error:
```
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
4 frames
/usr/local/lib/python3.7/dist-packages/nni/experiment/experiment.py in view(experiment_id, port, non_blocking)
    231         If false, run in the foreground. If true, run in the background.
    232     """
--> 233     experiment = Experiment._view(experiment_id)
    234     experiment.start(port=port, debug=False)
    235     if non_blocking:

/usr/local/lib/python3.7/dist-packages/nni/experiment/experiment.py in _view(exp_id, exp_dir)
    257     exp.id = exp_id
    258     exp._action = 'view'
--> 259     exp.config = launcher.get_stopped_experiment_config(exp_id, exp_dir)
    260     return exp
    261

/usr/local/lib/python3.7/dist-packages/nni/experiment/launcher.py in get_stopped_experiment_config(exp_id, exp_dir)
    267 def get_stopped_experiment_config(exp_id, exp_dir=None):
    268     config_json = get_stopped_experiment_config_json(exp_id, exp_dir)  # type: ignore
--> 269     config = ExperimentConfig(**config_json)  # type: ignore
    270     if exp_dir and not os.path.samefile(exp_dir, config.experiment_working_directory):
    271         msg = 'Experiment working directory provided in command line (%s) is different from experiment config (%s)'

/usr/local/lib/python3.7/dist-packages/nni/experiment/config/experiment_config.py in __init__(self, training_service_platform, **kwargs)
     84
     85     def __init__(self, training_service_platform=None, **kwargs):
---> 86         super().__init__(**kwargs)
     87         if training_service_platform is not None:
     88             # the user chose to init with config = ExperimentConfig('local') and set fields later

/usr/local/lib/python3.7/dist-packages/nni/experiment/config/base.py in __init__(self, **kwargs)
     91     class_name = type(self).__name__
     92     fields = ', '.join(args.keys())
---> 93     raise AttributeError(f'{class_name} does not have field(s) {fields}')
     94
     95     # try to unpack nested config

AttributeError: ExperimentConfig does not have field(s) executionengine
```
I checked and there is a field execution_engine, but it seems that somewhere the code is looking for one without the underscore.
@peter-ch view of RetiariiExperiment will be formally supported in v2.9.