
'.ipython/profile_ssh' not found

Open xiedidan opened this issue 3 years ago • 4 comments

I'm trying to start a cluster over SSH. The following error message appears in the engine log:

2021-12-08 00:42:19.562 [IPEngine] Config changed: {'ProfileDir': {'location': '.ipython/profile_ssh'}, 'IPEngine': {'work_dir': '/home/xd/project/Finance/quant_v1', 'profile': 'ssh'}, 'Session': {'key': b'8dc49daa-41cace1936b400470864d3d2', 'signature_scheme': 'hmac-sha256', 'packer': 'json', 'unpacker': 'json'}, 'IPKernelApp': {'exec_lines': [], 'exec_files': []}, 'HistoryManager': {'hist_file': ':memory:'}}
2021-12-08 00:42:19.562 [IPEngine] CRITICAL | Profile directory '.ipython/profile_ssh' not found.

It looks like I should set an absolute path for 'ProfileDir': {'location': '.ipython/profile_ssh'}, but I don't know how to do that.
I've tried setting c.ProfileDir.location = '/home/xd/.ipython/profile_ssh' in ipcluster_config.py, with no luck. Thanks.

xiedidan avatar Dec 07 '21 16:12 xiedidan

Can you include more code to reproduce the issue? Have you created the profile already (ipython profile create ssh)?

minrk avatar Dec 08 '21 07:12 minrk

profile_ssh.tar.gz

profile_ssh_218.tar.gz

Yes, profile_ssh has been created (see profile_ssh.tar.gz). I'm trying to run the controller on 192.168.5.71 and engines on both 192.168.5.71 and 192.168.5.218. The profile dir was copied to 218 automatically (see profile_ssh_218.tar.gz), and the controller tried to start engines on 218, but all of the engines failed with the '.ipython/profile_ssh' not found error.

In JupyterLab I use the following code to start the cluster:

import ipyparallel as ipp
cluster = ipp.Cluster(profile_dir='/home/xd/.ipython/profile_ssh', cluster_id='')
client = cluster.start_and_connect_sync()

The cell output is:

Starting 8 engines with <class 'ipyparallel.cluster.launcher.SSHEngineSetLauncher'>
[ProfileCreate] Generating default config file: '.ipython/profile_ssh/ipython_config.py'
[ProfileCreate] Generating default config file: '.ipython/profile_ssh/ipython_kernel_config.py'
ensuring remote 192.168.5.218:.ipython/profile_ssh/security/ exists
sending /home/xd/.ipython/profile_ssh/security/ipcontroller-client.json to 192.168.5.218:.ipython/profile_ssh/security/ipcontroller-client.json
ensuring remote 192.168.5.218:.ipython/profile_ssh/security/ exists
sending /home/xd/.ipython/profile_ssh/security/ipcontroller-engine.json to 192.168.5.218:.ipython/profile_ssh/security/ipcontroller-engine.json
Running `/home/xd/anaconda3/envs/finance/bin/python -m ipyparallel.engine --work-dir=/home/xd/project/Finance/quant_v1 --profile=ssh`
Running `/home/xd/anaconda3/envs/finance/bin/python -m ipyparallel.engine --work-dir=/home/xd/project/Finance/quant_v1 --profile=ssh`
Running `/home/xd/anaconda3/envs/finance/bin/python -m ipyparallel.engine --work-dir=/home/xd/project/Finance/quant_v1 --profile=ssh`
Running `/home/xd/anaconda3/envs/finance/bin/python -m ipyparallel.engine --work-dir=/home/xd/project/Finance/quant_v1 --profile=ssh`
Running `/home/xd/anaconda3/envs/finance/bin/python -m ipyparallel.engine --work-dir=/home/xd/project/Finance/quant_v1 --profile=ssh`
Running `/home/xd/anaconda3/envs/finance/bin/python -m ipyparallel.engine --work-dir=/home/xd/project/Finance/quant_v1 --profile=ssh`
Running `/home/xd/anaconda3/envs/finance/bin/python -m ipyparallel.engine --work-dir=/home/xd/project/Finance/quant_v1 --profile=ssh`
Running `/home/xd/anaconda3/envs/finance/bin/python -m ipyparallel.engine --work-dir=/home/xd/project/Finance/quant_v1 --profile=ssh`
fetching /tmp/tmpv0virpoo/ipengine-1638895319.8639.out from 192.168.5.218:.ipython/profile_ssh/log/ipengine-1638895319.8639.out
Removing 192.168.5.218:.ipython/profile_ssh/log/ipengine-1638895319.8639.out
fetching /tmp/tmpcvb5rpr7/ipengine-1638895323.9537.out from 192.168.5.218:.ipython/profile_ssh/log/ipengine-1638895323.9537.out
Removing 192.168.5.218:.ipython/profile_ssh/log/ipengine-1638895323.9537.out
fetching /tmp/tmpado3ikz7/ipengine-1638895327.8639.out from 192.168.5.218:.ipython/profile_ssh/log/ipengine-1638895327.8639.out
Removing 192.168.5.218:.ipython/profile_ssh/log/ipengine-1638895327.8639.out
fetching /tmp/tmp_fmgo0tg/ipengine-1638895331.7824.out from 192.168.5.218:.ipython/profile_ssh/log/ipengine-1638895331.7824.out
Removing 192.168.5.218:.ipython/profile_ssh/log/ipengine-1638895331.7824.out
fetching /tmp/tmpwr984gkf/ipengine-1638895336.0938.out from 192.168.5.218:.ipython/profile_ssh/log/ipengine-1638895336.0938.out
Removing 192.168.5.218:.ipython/profile_ssh/log/ipengine-1638895336.0938.out
fetching /tmp/tmpv1iuy8zz/ipengine-1638895340.0645.out from 192.168.5.218:.ipython/profile_ssh/log/ipengine-1638895340.0645.out
Removing 192.168.5.218:.ipython/profile_ssh/log/ipengine-1638895340.0645.out
fetching /tmp/tmpbxqoopnj/ipengine-1638895344.0524.out from 192.168.5.218:.ipython/profile_ssh/log/ipengine-1638895344.0524.out
Removing 192.168.5.218:.ipython/profile_ssh/log/ipengine-1638895344.0524.out
fetching /tmp/tmpcpimll00/ipengine-1638895348.0458.out from 192.168.5.218:.ipython/profile_ssh/log/ipengine-1638895348.0458.out
Removing 192.168.5.218:.ipython/profile_ssh/log/ipengine-1638895348.0458.out
engine set stopped 1638895313: {'engines': {'192.168.5.218/0': {'exit_code': -1, 'pid': 26085, 'identifier': '192.168.5.218/0'}, '192.168.5.218/1': {'exit_code': -1, 'pid': 26305, 'identifier': '192.168.5.218/1'}, '192.168.5.218/2': {'exit_code': -1, 'pid': 26528, 'identifier': '192.168.5.218/2'}, '192.168.5.218/3': {'exit_code': -1, 'pid': 26750, 'identifier': '192.168.5.218/3'}, '192.168.5.218/4': {'exit_code': -1, 'pid': 26973, 'identifier': '192.168.5.218/4'}, '192.168.5.218/5': {'exit_code': -1, 'pid': 27193, 'identifier': '192.168.5.218/5'}, '192.168.5.218/6': {'exit_code': -1, 'pid': 27415, 'identifier': '192.168.5.218/6'}, '192.168.5.218/7': {'exit_code': -1, 'pid': 27637, 'identifier': '192.168.5.218/7'}}, 'exit_code': -1}

I have to quickly cat the ipengine-xxx.xxx.out files, since they are removed right after the process exits.
I tried running /home/xd/anaconda3/envs/finance/bin/python -m ipyparallel.engine --work-dir=/home/xd/project/Finance/quant_v1 --profile=ssh manually on 218, and it seems to work fine that way:

(base) xd@xd-supermicro:~$ /home/xd/anaconda3/envs/finance/bin/python -m ipyparallel.engine --work-dir=/home/xd/project/Finance/quant_v1 --profile=ssh
2021-12-08 22:07:52.112 [IPEngine] IPYTHONDIR set to: /home/xd/.ipython
2021-12-08 22:07:52.114 [IPEngine] Using existing profile dir: '/home/xd/.ipython/profile_ssh'
2021-12-08 22:07:52.115 [IPEngine] Searching path ['/home/xd', '/home/xd/.ipython/profile_ssh', '/home/xd/anaconda3/envs/finance/etc/ipython', '/usr/local/etc/ipython', '/etc/ipython'] for config files
2021-12-08 22:07:52.116 [IPEngine] Attempting to load config file: ipython_config.py
2021-12-08 22:07:52.116 [IPEngine] Looking for ipython_config in /etc/ipython
2021-12-08 22:07:52.116 [IPEngine] Looking for ipython_config in /usr/local/etc/ipython
2021-12-08 22:07:52.116 [IPEngine] Looking for ipython_config in /home/xd/anaconda3/envs/finance/etc/ipython
2021-12-08 22:07:52.116 [IPEngine] Looking for ipython_config in /home/xd/.ipython/profile_ssh
2021-12-08 22:07:52.118 [IPEngine] Loaded config file: /home/xd/.ipython/profile_ssh/ipython_config.py
2021-12-08 22:07:52.118 [IPEngine] Looking for ipython_config in /home/xd
2021-12-08 22:07:52.119 [IPEngine] Attempting to load config file: ipengine_config.py
2021-12-08 22:07:52.120 [IPEngine] Looking for ipengine_config in /etc/ipython
2021-12-08 22:07:52.120 [IPEngine] Looking for ipengine_config in /usr/local/etc/ipython
2021-12-08 22:07:52.120 [IPEngine] Looking for ipengine_config in /home/xd/anaconda3/envs/finance/etc/ipython
2021-12-08 22:07:52.120 [IPEngine] Looking for ipengine_config in /home/xd/.ipython/profile_ssh
2021-12-08 22:07:52.120 [IPEngine] Looking for ipengine_config in /home/xd
2021-12-08 22:07:52.125 [IPEngine] Changing to working dir: /home/xd/project/Finance/quant_v1
2021-12-08 22:07:52.126 [IPEngine] Loading connection file '/home/xd/.ipython/profile_ssh/security/ipcontroller-engine.json'
2021-12-08 22:07:52.138 [IPEngine] WARNING | Not using CurveZMQ security
2021-12-08 22:07:52.141 [IPEngine] Config changed:
2021-12-08 22:07:52.141 [IPEngine] {'IPEngine': {'work_dir': '/home/xd/project/Finance/quant_v1', 'profile': 'ssh'}, 'Session': {'key': b'8dc49daa-41cace1936b400470864d3d2', 'signature_scheme': 'hmac-sha256', 'packer': 'json', 'unpacker': 'json'}}
2021-12-08 22:07:52.143 [IPEngine] Registering with controller at tcp://192.168.5.71:50813
2021-12-08 22:07:52.149 [IPEngine] Shell_addrs: ['tcp://192.168.5.71:33927', 'tcp://192.168.5.71:40413', 'tcp://192.168.5.71:41163']
2021-12-08 22:07:52.150 [IPEngine] Setting shell identity b'0a2083d1-cf141668cbaa7ca3c7048532'
2021-12-08 22:07:52.150 [IPEngine] Connecting shell to tcp://192.168.5.71:33927
2021-12-08 22:07:52.150 [IPEngine] Connecting shell to tcp://192.168.5.71:40413
2021-12-08 22:07:52.151 [IPEngine] Connecting shell to tcp://192.168.5.71:41163
2021-12-08 22:07:52.151 [IPEngine] Starting nanny
2021-12-08 22:07:53.082 [KernelNanny.8] Starting kernel nanny for engine 8, pid=7245, nanny pid=7250
2021-12-08 22:07:53.087 [KernelNanny.8] Nanny watching parent pid 7245.
2021-12-08 22:07:53.098 [IPEngine] Seeing logger to stderr, rerouting to raw filedescriptor.
2021-12-08 22:07:53.172 [IPEngine] Config changed: {'IPEngine': {'work_dir': '/home/xd/project/Finance/quant_v1', 'profile': 'ssh'}, 'Session': {'key': b'8dc49daa-41cace1936b400470864d3d2', 'signature_scheme': 'hmac-sha256', 'packer': 'json', 'unpacker': 'json'}, 'IPKernelApp': {'exec_lines': [], 'exec_files': []}, 'HistoryManager': {'hist_file': ':memory:'}}
2021-12-08 22:07:53.173 [IPEngine] IPYTHONDIR set to: /home/xd/.ipython
2021-12-08 22:07:53.175 [IPEngine] Using existing profile dir: '/home/xd/.ipython/profile_default'
2021-12-08 22:07:53.179 [IPEngine] WARNING | debugpy_stream undefined, debugging will not be enabled
2021-12-08 22:07:53.183 [IPEngine] Starting to monitor the heartbeat signal from the hub every 3500 ms.
2021-12-08 22:07:53.184 [IPEngine] Completed registration with id 8

Now I'm stuck. As you can see from the engine logs, an engine started by the controller automatically sets 'ProfileDir': {'location': '.ipython/profile_ssh'}, while a manually started engine does not. I think that might be the problem, but I don't know how to solve it.

xiedidan avatar Dec 08 '21 14:12 xiedidan

I'm sorry, I was sure I wrote a reply to this some time ago.

I believe the crux is the combination of relative profile directory and work directory, so it is probably looking for the profile in /home/xd/project/Finance/quant_v1/.ipython/profile_ssh instead of $HOME/.ipython/profile_ssh.

If you can get away with not specifying work_dir (e.g. os.chdir at the beginning of your code), I bet it will work while we figure out what to fix.
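
To illustrate the suspected path resolution described above, here is a minimal sketch using only the standard library. It does not call ipyparallel; the helper name resolve_profile_dir is hypothetical, and it simply assumes that a relative ProfileDir.location is interpreted against the process's current directory, which is the hypothesis in this comment rather than confirmed engine behaviour.

```python
import posixpath


def resolve_profile_dir(location, cwd):
    """Hypothetical helper: resolve a profile location against a working directory.

    A relative location is joined onto cwd; an absolute location is
    returned unchanged, because posixpath.join discards earlier parts
    when a later part is absolute.
    """
    return posixpath.normpath(posixpath.join(cwd, location))


home = "/home/xd"
work_dir = "/home/xd/project/Finance/quant_v1"
relative = ".ipython/profile_ssh"
absolute = "/home/xd/.ipython/profile_ssh"

# Started from the home directory, the relative location points at the real profile.
print(resolve_profile_dir(relative, home))

# After changing into work_dir, the same relative location points somewhere else.
print(resolve_profile_dir(relative, work_dir))

# An absolute location is unaffected by the working directory.
print(resolve_profile_dir(absolute, work_dir))
```

If this is what happens on the remote host, it would explain why the engine started by hand (without an explicit relative ProfileDir.location) finds /home/xd/.ipython/profile_ssh, while the engines started by the controller do not.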

minrk avatar Dec 22 '21 10:12 minrk

I have to quickly cat the ipengine-xxx.xxx.out files, since they are removed right after the process exits.

These files are removed because the output is already retrieved. You can view it with cluster.engine_set.get_output(). This should be easier to discover!

minrk avatar Dec 22 '21 10:12 minrk