liuzhe-lz comments

Results: 24 comments of liuzhe-lz

nni installation problem on miniforge3 mac M1

~~NNI does not support ARM platform. Please use x86 Python with Rosetta.~~ ~~Or you can try to build from source, by changing `x64` [here](https://github.com/microsoft/nni/blob/master/setup_ts.py#L79) to `arm64`. NNI code itself is...

nni installation problem on miniforge3 mac M1

> > > NNI does not support ARM platform. Please use x86 Python with Rosetta. > > > Or you can try to build from source, by changing `x64` [here](https://github.com/microsoft/nni/blob/master/setup_ts.py#L79)...

torch.cuda.is_available() changes to FALSE at the second trial

Please print `os.environ['CUDA_VISIBLE_DEVICES']` to the log and tell us its value.
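A minimal sketch of how a trial script might log that value. It uses `.get()` because indexing `os.environ` directly raises `KeyError` when the variable is unset. The helper name `describe_cuda_visible_devices` is hypothetical and not part of NNI.

```python
import os


def describe_cuda_visible_devices(environ=os.environ):
    """Return a log-friendly description of CUDA_VISIBLE_DEVICES."""
    value = environ.get("CUDA_VISIBLE_DEVICES")
    if value is None:
        return "CUDA_VISIBLE_DEVICES is not set"
    # repr() makes an empty string (no GPUs visible) distinguishable from unset.
    return f"CUDA_VISIBLE_DEVICES={value!r}"


# Call this at the start of each trial so the value appears in the trial log.
print(describe_cuda_visible_devices())
```

An empty string and an unset variable mean different things for GPU visibility, so the sketch reports them separately.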

How to set useActivateGpu=true in remote mode?

It is a per-machine config for remote mode.

```yaml
maxTrialNumber: 20
trialCommand: python main.py
trialCodeDirectory: .
trialGpuNumber: 2
trialConcurrency: 4
tuner:
  name: TPE
  classArgs:
    optimize_mode: maximize
trainingService:
  platform: remote
  reuseMode:...
```

Having trouble nni with frameworkcontroller on k8s

NNI v2.9 has been released.

my experiment has error: ERROR (NNIManager) Dispatcher error: tuner_command_channel: Tuner closed connection

How often does your trial code report intermediate results? If your epochs are short, you can try reporting an intermediate result every 10 epochs.
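A rough sketch of throttled reporting, assuming a simple epoch loop. The names `train`, `run_epoch`, and `report_every` are hypothetical. In a real trial, `report` would presumably be `nni.report_intermediate_result`, and the returned metric would be passed to `nni.report_final_result`.

```python
def train(num_epochs, run_epoch, report, report_every=10):
    """Run epochs, calling `report` only once every `report_every` epochs."""
    metric = None
    for epoch in range(1, num_epochs + 1):
        metric = run_epoch(epoch)
        # Report on epochs 10, 20, 30, ... instead of after every epoch.
        if epoch % report_every == 0:
            report(metric)
    return metric


# Stand-in demo: the "metric" is just the epoch number, and reports are
# collected in a list instead of being sent to NNI.
reported = []
final_metric = train(25, run_epoch=lambda epoch: epoch, report=reported.append)
print(reported, final_metric)
```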

The hardcoded timeout value sometimes will cause connection error

We'll consider adding this feature in the next release. You can open a PR if you want.

Log mechanism discussion

Discussion iteration 1 conclusions:

### Log files

Each HPO experiment writes 3 files:

- ~/nni-experiments/EXPERIMENT-ID/logs/experiment.log
- ~/nni-experiments/EXPERIMENT-ID/logs/nnimanager.log
- ~/nni-experiments/EXPERIMENT-ID/logs/dispatcher.log

Each NAS multi-trial experiment writes 2 files:

- ~/nni-experiments/EXPERIMENT-ID/logs/experiment.log
- ~/nni-experiments/EXPERIMENT-ID/logs/nnimanager.log...
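As an illustration only, the file layout listed in this comment could be expressed as a small path helper. The function `expected_log_files` and the `kind` labels `"hpo"` and `"nas"` are hypothetical names for this sketch and are not part of NNI.

```python
from pathlib import Path

# File names taken from the lists in the comment.
LOG_NAMES = {
    "hpo": ["experiment.log", "nnimanager.log", "dispatcher.log"],
    "nas": ["experiment.log", "nnimanager.log"],
}


def expected_log_files(experiment_id, kind="hpo", root=None):
    """Return the log file paths an experiment of the given kind would write."""
    base = Path(root) if root is not None else Path.home() / "nni-experiments"
    log_dir = base / experiment_id / "logs"
    return [log_dir / name for name in LOG_NAMES[kind]]


for path in expected_log_files("EXPERIMENT-ID", kind="nas"):
    print(path)
```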

SQLITE_IOERR when setting up rest server

Please check that you have sufficient disk space and permission to write to the ~/nni-experiments/ directory.
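One way to run both checks from Python, as a sketch using only the standard library. The helper name `check_experiment_dir` and the 100 MiB default threshold are arbitrary assumptions, not values NNI prescribes.

```python
import os
import shutil


def check_experiment_dir(path="~/nni-experiments", min_free_bytes=100 * 1024 * 1024):
    """Report free space and write permission for the experiment directory."""
    probe = os.path.expanduser(path)
    # If the directory does not exist yet, check its nearest existing ancestor.
    while not os.path.exists(probe):
        parent = os.path.dirname(probe)
        if parent == probe:
            break
        probe = parent
    free = shutil.disk_usage(probe).free
    return {
        "checked_path": probe,
        "free_bytes": free,
        "enough_space": free >= min_free_bytes,
        "writable": os.access(probe, os.W_OK),
    }


print(check_experiment_dir())
```

`os.access` only reflects the permission bits for the current user. Quotas or a read-only mount can still cause writes to fail.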

SQLITE_IOERR when setting up rest server

```
[2022-06-23 02:11:31] ERROR (NNIManager) Dispatcher error: read ECONNRESET
[2022-06-23 02:11:31] ERROR (NNIManager) Error: Dispatcher stream error, tuner may have crashed.
```

It seems something is wrong in the tuner. What's the...