nni
Error when running in frameworkcontroller mode
config.yml
experimentName: example_mnist_pytorch
trialConcurrency: 1
maxExecDuration: 1h
maxTrialNum: 2
debug: false
nniManagerIp: 172.16.40.155
#choice: local, remote, pai, kubeflow
trainingServicePlatform: frameworkcontroller
searchSpacePath: search_space.json
#choice: true, false
useAnnotation: false
tuner:
  #choice: TPE, Random, Anneal, Evolution, BatchTuner, MetisTuner, GPTuner
  builtinTunerName: TPE
  classArgs:
    #choice: maximize, minimize
    optimize_mode: maximize
trial:
  codeDir: .
  taskRoles:
    - name: worker
      taskNum: 1
      command: python3 model.py
      gpuNum: 0
      cpuNum: 1
      memoryMB: 8192
      image: frameworkcontroller/nni:v1.0
      securityContext:
        privileged: true
      frameworkAttemptCompletionPolicy:
        minFailedTaskCount: 1
        minSucceededTaskCount: 1
frameworkcontrollerConfig:
  storage: nfs
  serviceAccountName: frameworkcontroller
  nfs:
    # Your NFS server IP, like 10.10.10.10
    server:
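One thing worth checking in the config above is the `nfs` section: `server:` is blank as posted (possibly just redacted for the issue), and no `path:` entry appears at all. As far as I recall, the NNI FrameworkController documentation lists both `server` and `path` under `nfs`. The snippet below is only a minimal sketch of that check; `check_nfs_settings` is a hypothetical helper, not part of NNI, and it works on a plain dict as a YAML loader would return it.

```python
def check_nfs_settings(config):
    """Return a list of problems found in the NFS part of an experiment config.

    Hypothetical helper for illustration only; it is not an NNI API.
    """
    problems = []
    fc = config.get("frameworkcontrollerConfig") or {}
    if fc.get("storage") != "nfs":
        return problems  # only the NFS storage type is checked here
    nfs = fc.get("nfs") or {}
    for key in ("server", "path"):
        value = nfs.get(key)
        # An empty YAML value such as `server:` loads as None.
        if value is None or not str(value).strip():
            problems.append(
                "frameworkcontrollerConfig.nfs.%s is missing or empty" % key
            )
    return problems


# The relevant part of the posted config, written out as a dict.
posted = {
    "frameworkcontrollerConfig": {
        "storage": "nfs",
        "serviceAccountName": "frameworkcontroller",
        "nfs": {"server": None},
    }
}

for problem in check_nfs_settings(posted):
    print(problem)
```

If the real file does contain the server address and an exported path such as `/nfs/nni`, this check returns an empty list and the cause lies elsewhere.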
NFS server, /etc/exports:
/nfs/nni *(rw,no_root_squash,sync,insecure,no_subtree_check)
k8s logs
kubectl describe pod nniexpywn5cftxenvma3hh-worker-0:
How can I solve this error?
I found one problem:
the directory nni/ywn5cftx does not exist in the shared directory /nfs/nni:
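To make that check repeatable, here is a small sketch that derives the expected directory from the pod name and tests whether it exists. The naming pattern (`nniexp` + 8-character experiment ID + `env` + trial ID + `-<role>-<index>`) is an assumption inferred only from the pod name and directory quoted in this issue, not from NNI documentation, and `expected_experiment_dir` is a hypothetical helper.

```python
import re
from pathlib import Path

# Assumed pattern, inferred from "nniexpywn5cftxenvma3hh-worker-0" and "nni/ywn5cftx".
POD_NAME = re.compile(
    r"^nniexp(?P<experiment>[a-z0-9]{8})env(?P<trial>[a-z0-9]+)"
    r"-(?P<role>[a-z0-9]+)-(?P<index>\d+)$"
)


def expected_experiment_dir(pod_name, mount_point):
    """Return <mount_point>/nni/<experiment id> for a trial pod name."""
    match = POD_NAME.match(pod_name)
    if match is None:
        raise ValueError("unrecognized pod name: %r" % pod_name)
    return Path(mount_point) / "nni" / match.group("experiment")


if __name__ == "__main__":
    # Run this where the export is visible, e.g. on the NFS server itself.
    target = expected_experiment_dir("nniexpywn5cftxenvma3hh-worker-0", "/nfs/nni")
    print(target, "exists" if target.is_dir() else "is missing")
```

If the directory is missing, the export may not be reachable or writable from the machine that is supposed to create it. That is a guess, though, since the `kubectl describe` output is not included here.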