
Error when running in frameworkcontroller mode

Open · N-Kingsley opened this issue 2 years ago · 1 comment

config.yml

    experimentName: example_mnist_pytorch
    trialConcurrency: 1
    maxExecDuration: 1h
    maxTrialNum: 2
    debug: false
    nniManagerIp: 172.16.40.155
    #choice: local, remote, pai, kubeflow
    trainingServicePlatform: frameworkcontroller
    searchSpacePath: search_space.json
    #choice: true, false
    useAnnotation: false
    tuner:
      #choice: TPE, Random, Anneal, Evolution, BatchTuner, MetisTuner, GPTuner
      builtinTunerName: TPE
      classArgs:
        #choice: maximize, minimize
        optimize_mode: maximize
    trial:
      codeDir: .
      taskRoles:
        - name: worker
          taskNum: 1
          command: python3 model.py
          gpuNum: 0
          cpuNum: 1
          memoryMB: 8192
          image: frameworkcontroller/nni:v1.0
          securityContext:
            privileged: true
          frameworkAttemptCompletionPolicy:
            minFailedTaskCount: 1
            minSucceededTaskCount: 1
    frameworkcontrollerConfig:
      storage: nfs
      serviceAccountName: frameworkcontroller
      nfs:
        # Your NFS server IP, like 10.10.10.10
        server:
        # Your NFS server export path, like /var/nfs/nni
        path: /nfs/nni
        readOnly: false
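Note that `server:` under `nfs` has no value in the config as posted (it may simply have been blanked out before posting). A quick pre-flight check along these lines can catch an empty NFS setting before the experiment is submitted. This is only a sketch: `missing_nfs_fields` is a hypothetical helper, not part of NNI, and the config is passed in as a plain dict rather than parsed from `config.yml`.

```python
def missing_nfs_fields(config):
    """Return dotted names of NFS settings that are absent or empty.

    Hypothetical helper for illustration; NNI performs its own validation.
    """
    fc = config.get("frameworkcontrollerConfig") or {}
    if fc.get("storage") != "nfs":
        return []  # this sketch only inspects NFS storage settings
    nfs = fc.get("nfs") or {}
    missing = []
    for key in ("server", "path"):
        value = nfs.get(key)
        if value is None or str(value).strip() == "":
            missing.append(f"frameworkcontrollerConfig.nfs.{key}")
    return missing


# Values mirror the posted config; `server` is empty there.
posted = {
    "frameworkcontrollerConfig": {
        "storage": "nfs",
        "serviceAccountName": "frameworkcontroller",
        "nfs": {"server": None, "path": "/nfs/nni", "readOnly": False},
    }
}
print(missing_nfs_fields(posted))
```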

NFS server, /etc/exports:

    /nfs/nni *(rw,no_root_squash,sync,insecure,no_subtree_check)

k8s logs

kubectl describe pod nniexpywn5cftxenvma3hh-worker-0: [image]

How to solve this error?

N-Kingsley · May 19 '22 08:05

I found a problem: the directory nni/ywn5cftx does not exist in the shared directory /nfs/nni: [image]
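For anyone wanting to reproduce this check, a minimal sketch is below. It assumes the export is mounted locally at `/nfs/nni` and that the experiment directory is expected at `nni/<experiment id>` under it, as described in this comment; `experiment_dir_exists` is a hypothetical helper, and the actual layout NNI uses may differ.

```python
import os


def experiment_dir_exists(mount_point, experiment_id):
    """Check for <mount_point>/nni/<experiment_id>.

    The nni/<experiment_id> layout is an assumption taken from this thread.
    Returns (found, expected_path).
    """
    expected = os.path.join(mount_point, "nni", experiment_id)
    return os.path.isdir(expected), expected


# Mount point and experiment id are the ones mentioned in this issue.
found, path = experiment_dir_exists("/nfs/nni", "ywn5cftx")
print(f"{path}: {'present' if found else 'missing'}")
```

If the directory is reported missing, it may be worth checking whether the machine running nnictl can write to the share at all, since a pod can only mount a path that already exists on the export.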

N-Kingsley · May 20 '22 08:05