sparkmagic
Unable to connect to Spark or PySpark kernels when starting a notebook via JupyterHub with YarnSpawner
This is a weird issue. We are running a cluster with Hadoop, Spark, Livy, and JupyterHub. Sparkmagic works correctly when a notebook is started from the CLI (jupyter notebook --ip:**). We then package the conda environment and localize this Python environment to each container. The following configuration has been added to jupyterhub_config.py:
c.Spawner.notebook_dir = '$(pwd)'
c.YarnSpawner.localize_files = {
    'environment': {
        'source': 'hdfs://***:8020/jupyterhub/environments/environment3.tar.gz',
        'visibility': 'public'
    },
    'sparkmagic': {
        'source': 'hdfs://***:8020/jupyterhub/environments/sparkmagic.tar.gz',
        'visibility': 'public'
    }
}
However, when creating a new notebook with either the Spark or PySpark kernel in a container, the kernel never starts.
[D 2019-08-05 19:09:22.133 YarnSingleUserNotebookApp manager:257] Starting kernel: ['/data12/hadoop/yarn/local/usercache/fanxin8/appcache/application_1564654299855_0053/container_e119_1564654299855_0053_01_000001/environment/bin/python', '-m', 'sparkmagic.kernels.sparkkernel.sparkkernel', '-f', './.jupyter/kernel-e1946d5a-94eb-4f8b-8001-470d8164e65a.json']
[D 2019-08-05 19:09:22.155 YarnSingleUserNotebookApp connect:542] Connecting to: tcp://127.0.0.1:41577
[D 2019-08-05 19:09:22.156 YarnSingleUserNotebookApp connect:542] Connecting to: tcp://127.0.0.1:61583
[I 2019-08-05 19:09:22.158 YarnSingleUserNotebookApp kernelmanager:172] Kernel started: e1946d5a-94eb-4f8b-8001-470d8164e65a
[D 2019-08-05 19:09:22.158 YarnSingleUserNotebookApp kernelmanager:173] Kernel args: {'kernel_name': 'sparkkernel', 'cwd': '/data12/hadoop/yarn/local/usercache/fanxin8/appcache/application_1564654299855_0053/container_e119_1564654299855_0053_01_000001'}
[I 2019-08-05 19:09:22.160 YarnSingleUserNotebookApp log:174] 201 POST /user/fanxin8/api/sessions (fanxin8@::ffff:10.112.115.253) 31.18ms
[D 2019-08-05 19:09:22.162 YarnSingleUserNotebookApp auth:857] Allowing Hub admin fanxin8
[D 2019-08-05 19:09:22.162 YarnSingleUserNotebookApp handlers:18] Serving kernel resource from: /data12/hadoop/yarn/local/usercache/fanxin8/appcache/application_1564654299855_0053/container_e119_1564654299855_0053_01_000001/environment/share/jupyter/kernels/sparkkernel
[D 2019-08-05 19:09:22.163 YarnSingleUserNotebookApp log:174] 200 GET /user/fanxin8/kernelspecs/sparkkernel/kernel.js?v=20190805185419 (fanxin8@::ffff:10.112.115.253) 2.42ms
[D 2019-08-05 19:09:22.199 YarnSingleUserNotebookApp zmqhandlers:296] Initializing websocket connection /user/fanxin8/api/kernels/e1946d5a-94eb-4f8b-8001-470d8164e65a/channels
[D 2019-08-05 19:09:22.201 YarnSingleUserNotebookApp auth:857] Allowing Hub admin fanxin8
[D 2019-08-05 19:09:22.201 YarnSingleUserNotebookApp handlers:139] Requesting kernel info from e1946d5a-94eb-4f8b-8001-470d8164e65a
[D 2019-08-05 19:09:22.201 YarnSingleUserNotebookApp connect:542] Connecting to: tcp://127.0.0.1:56087
/data12/hadoop/yarn/local/usercache/fanxin8/appcache/application_1564654299855_0053/container_e119_1564654299855_0053_01_000001/environment/lib/python3.6/site-packages/IPython/paths.py:68: UserWarning: IPython parent '/home' is not a writable location, using a temp directory.
" using a temp directory.".format(parent))
[D 2019-08-05 19:09:23.331 YarnSingleUserNotebookApp kernelmanager:419] activity on e1946d5a-94eb-4f8b-8001-470d8164e65a: stream
[D 2019-08-05 19:09:23.332 YarnSingleUserNotebookApp kernelmanager:419] activity on e1946d5a-94eb-4f8b-8001-470d8164e65a: stream
[D 2019-08-05 19:09:23.332 YarnSingleUserNotebookApp kernelmanager:419] activity on e1946d5a-94eb-4f8b-8001-470d8164e65a: stream
[D 2019-08-05 19:09:23.333 YarnSingleUserNotebookApp kernelmanager:419] activity on e1946d5a-94eb-4f8b-8001-470d8164e65a: stream
[I 2019-08-05 19:09:25.158 YarnSingleUserNotebookApp restarter:110] KernelRestarter: restarting kernel (1/5), new random ports
[D 2019-08-05 19:09:25.160 YarnSingleUserNotebookApp manager:257] Starting kernel: ['/data12/hadoop/yarn/local/usercache/fanxin8/appcache/application_1564654299855_0053/container_e119_1564654299855_0053_01_000001/environment/bin/python', '-m', 'sparkmagic.kernels.sparkkernel.sparkkernel', '-f', './.jupyter/kernel-e1946d5a-94eb-4f8b-8001-470d8164e65a.json']
[D 2019-08-05 19:09:25.181 YarnSingleUserNotebookApp connect:542] Connecting to: tcp://127.0.0.1:32991
/data12/hadoop/yarn/local/usercache/fanxin8/appcache/application_1564654299855_0053/container_e119_1564654299855_0053_01_000001/environment/lib/python3.6/site-packages/IPython/paths.py:68: UserWarning: IPython parent '/home' is not a writable location, using a temp directory.
" using a temp directory.".format(parent))
[I 2019-08-05 19:09:28.185 YarnSingleUserNotebookApp restarter:110] KernelRestarter: restarting kernel (2/5), new random ports
[D 2019-08-05 19:09:28.186 YarnSingleUserNotebookApp manager:257] Starting kernel: ['/data12/hadoop/yarn/local/usercache/fanxin8/appcache/application_1564654299855_0053/container_e119_1564654299855_0053_01_000001/environment/bin/python', '-m', 'sparkmagic.kernels.sparkkernel.sparkkernel', '-f', './.jupyter/kernel-e1946d5a-94eb-4f8b-8001-470d8164e65a.json']
[D 2019-08-05 19:09:28.208 YarnSingleUserNotebookApp connect:542] Connecting to: tcp://127.0.0.1:28586
/data12/hadoop/yarn/local/usercache/fanxin8/appcache/application_1564654299855_0053/container_e119_1564654299855_0053_01_000001/environment/lib/python3.6/site-packages/IPython/paths.py:68: UserWarning: IPython parent '/home' is not a writable location, using a temp directory.
" using a temp directory.".format(parent))
[I 2019-08-05 19:09:31.212 YarnSingleUserNotebookApp restarter:110] KernelRestarter: restarting kernel (3/5), new random ports
[D 2019-08-05 19:09:31.213 YarnSingleUserNotebookApp manager:257] Starting kernel: ['/data12/hadoop/yarn/local/usercache/fanxin8/appcache/application_1564654299855_0053/container_e119_1564654299855_0053_01_000001/environment/bin/python', '-m', 'sparkmagic.kernels.sparkkernel.sparkkernel', '-f', './.jupyter/kernel-e1946d5a-94eb-4f8b-8001-470d8164e65a.json']
[D 2019-08-05 19:09:31.235 YarnSingleUserNotebookApp connect:542] Connecting to: tcp://127.0.0.1:31821
/data12/hadoop/yarn/local/usercache/fanxin8/appcache/application_1564654299855_0053/container_e119_1564654299855_0053_01_000001/environment/lib/python3.6/site-packages/IPython/paths.py:68: UserWarning: IPython parent '/home' is not a writable location, using a temp directory.
" using a temp directory.".format(parent))
[I 2019-08-05 19:09:34.239 YarnSingleUserNotebookApp restarter:110] KernelRestarter: restarting kernel (4/5), new random ports
[D 2019-08-05 19:09:34.241 YarnSingleUserNotebookApp manager:257] Starting kernel: ['/data12/hadoop/yarn/local/usercache/fanxin8/appcache/application_1564654299855_0053/container_e119_1564654299855_0053_01_000001/environment/bin/python', '-m', 'sparkmagic.kernels.sparkkernel.sparkkernel', '-f', './.jupyter/kernel-e1946d5a-94eb-4f8b-8001-470d8164e65a.json']
[D 2019-08-05 19:09:34.263 YarnSingleUserNotebookApp connect:542] Connecting to: tcp://127.0.0.1:22746
/data12/hadoop/yarn/local/usercache/fanxin8/appcache/application_1564654299855_0053/container_e119_1564654299855_0053_01_000001/environment/lib/python3.6/site-packages/IPython/paths.py:68: UserWarning: IPython parent '/home' is not a writable location, using a temp directory.
" using a temp directory.".format(parent))
[W 2019-08-05 19:09:37.264 YarnSingleUserNotebookApp restarter:100] KernelRestarter: restart failed
[W 2019-08-05 19:09:37.264 YarnSingleUserNotebookApp kernelmanager:135] Kernel e1946d5a-94eb-4f8b-8001-470d8164e65a died, removing from map.
[D 2019-08-05 19:09:47.228 YarnSingleUserNotebookApp singleuser:503] Notifying Hub of activity 2019-08-05T11:09:23.332902Z
[W 2019-08-05 19:10:22.237 YarnSingleUserNotebookApp handlers:229] Timeout waiting for kernel_info reply from e1946d5a-94eb-4f8b-8001-470d8164e65a
[I 2019-08-05 19:10:22.239 YarnSingleUserNotebookApp log:174] 101 GET /user/fanxin8/api/kernels/e1946d5a-94eb-4f8b-8001-470d8164e65a/channels?session_id=d7c0ce7b072b424b86548ee9fbf44436 (fanxin8@::ffff:10.112.115.253) 60039.59ms
[D 2019-08-05 19:10:22.239 YarnSingleUserNotebookApp zmqhandlers:157] Opening websocket /user/fanxin8/api/kernels/e1946d5a-94eb-4f8b-8001-470d8164e65a/channels
[D 2019-08-05 19:10:22.239 YarnSingleUserNotebookApp kernelmanager:248] Getting buffer for e1946d5a-94eb-4f8b-8001-470d8164e65a
[E 2019-08-05 19:10:22.239 YarnSingleUserNotebookApp handlers:276] Error opening stream: HTTP 404: Not Found (Kernel does not exist: e1946d5a-94eb-4f8b-8001-470d8164e65a)
[D 2019-08-05 19:10:22.321 YarnSingleUserNotebookApp handlers:294] Received message on closed websocket '{"header":{"msg_id":"d5d66ddc95c04bdda0df55535fc22cde","username":"username","session":"d7c0ce7b072b424b86548ee9fbf44436","msg_type":"kernel_info_request","version":"5.2"},"metadata":{},"content":{},"buffers":[],"parent_header":{},"channel":"shell"}'
[D 2019-08-05 19:10:22.321 YarnSingleUserNotebookApp handlers:294] Received message on closed websocket '{"header":{"msg_id":"1028682aebb2481b8d2dd87e1074ccfa","username":"username","session":"d7c0ce7b072b424b86548ee9fbf44436","msg_type":"comm_info_request","version":"5.2"},"metadata":{},"content":{"target_name":"jupyter.widget"},"buffers":[],"parent_header":{},"channel":"shell"}'
[D 2019-08-05 19:10:22.321 YarnSingleUserNotebookApp handlers:429] Websocket closed e1946d5a-94eb-4f8b-8001-470d8164e65a:d7c0ce7b072b424b86548ee9fbf44436
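One workaround sketch for the UserWarning in the log above ("IPython parent '/home' is not a writable location"): JupyterHub's base Spawner exposes an environment trait for injecting extra environment variables, so HOME can be pointed at a directory the YARN container user can actually write to. The path below is hypothetical and this is untested in this setup.

```python
# jupyterhub_config.py -- hedged sketch, not a verified fix.
# Point HOME at a writable location inside the container so that
# IPython (and autovizwidget) can create their config directories
# instead of falling back to a temp directory or dying on startup.
c.Spawner.environment = {
    'HOME': '/tmp/jupyter-home',  # hypothetical writable path
}
```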
I hit an issue where, after starting JupyterHub and signing in, I choose the Spark kernel and evaluate spark in a cell, and it reports that the name spark is not defined. Did you meet the same issue?
In my situation, all of the problems were about the environment variables of sparkmagic and autovizwidget. The value of HOME_PATH in sparkmagic/autovizwidget/autovizwidget/utils/constants.py is hard-coded to ~/.autovizwidget, and the process does not have permission to create or write to that path. That is what causes this weird issue. According to your description, maybe the conda environment you packaged is not correct.
I changed the following in sparkmagic/autovizwidget/autovizwidget/utils/constants.py:

    HOME_PATH = os.environ.get("AUTOVIZWIDGET_CONF_DIR", "~/.autovizwidget")
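The patched lookup above is the standard env-var-with-fallback pattern; a small self-contained sketch of the same behavior (the function name here is illustrative, not sparkmagic's API):

```python
import os

def resolve_home_path(env_var="AUTOVIZWIDGET_CONF_DIR",
                      default="~/.autovizwidget"):
    """Prefer an environment-variable override; otherwise fall back
    to the previously hard-coded default under the user's home."""
    return os.path.expanduser(os.environ.get(env_var, default))

# With the override set, the config dir follows the env var,
# so it can point at a location the container user can write to:
os.environ["AUTOVIZWIDGET_CONF_DIR"] = "/tmp/autovizwidget"
print(resolve_home_path())  # /tmp/autovizwidget

# Without it, the old default (expanded under $HOME) still applies:
os.environ.pop("AUTOVIZWIDGET_CONF_DIR", None)
print(resolve_home_path())
```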