I successfully connect to my cluster in JupyterLab, but when I run some test code the notebook does not respond

Open lognat0704 opened this issue 3 years ago • 8 comments

Hi DataBricks team,

I can successfully connect to my cluster from JupyterLab, but when I try to run some test code there is no response. I am sure I can ssh to my cluster. How do I fix this?

Screen Shot 2021-09-02 at 12 41 09 AM Screen Shot 2021-09-02 at 12 41 30 AM Screen Shot 2021-09-02 at 1 01 43 AM

lognat0704 avatar Sep 01 '21 15:09 lognat0704

I also tried troubleshooting steps such as reinstalling conda and running ssh $cluster, but it is still not working...

Screen Shot 2021-09-02 at 7 43 01 PM Screen Shot 2021-09-02 at 7 48 49 PM

nelsontseng0704 avatar Sep 02 '21 14:09 nelsontseng0704

@nelsontseng0704 It looks like something in your environment variables is different than expected. Do you use the -e or --env argument, as in dj $PROFILE -k -e ...? What does the command line you use to create the kernelspec look like (something like dj $PROFILE -k ...)? For example, if I use dj westeu -k -e VAR1=VAR2=2 then I get a similar error:

Traceback (most recent call last):
  File "/opt/miniconda/envs/dj/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/opt/miniconda/envs/dj/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/opt/miniconda/envs/dj/lib/python3.8/site-packages/databrickslabs_jupyterlab/__main__.py", line 65, in <module>
    main(args.host, connection_info, args.python, args.s, args.timeout, args.env, args.no_spark)
  File "/opt/miniconda/envs/dj/lib/python3.8/site-packages/databrickslabs_jupyterlab/__main__.py", line 19, in main
    kernel = DatabricksKernel(host, conn_info, python_path, sudo, timeout, env, no_spark)
  File "/opt/miniconda/envs/dj/lib/python3.8/site-packages/databrickslabs_jupyterlab/kernel.py", line 37, in __init__
    self.dbjl_env = dict([e.split("=") for e in env[0].split(" ")])
ValueError: dictionary update sequence element #4 has length 3; 2 is required
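
A minimal sketch of why this line fails, run outside the kernel. It applies the same expression as kernel.py line 37 to a made-up env list; the variable names and values are assumptions for illustration, not what dj actually passes.

```python
# Hypothetical stand-in for the list the kernel receives via --env.
env = ["VAR1=1 VAR2=VAR3=2"]

# Same expression as in kernel.py line 37: split on spaces, then on every "=".
pairs = [e.split("=") for e in env[0].split(" ")]
# The second item contains two "=" signs, so it splits into three parts.

try:
    dict(pairs)  # dict() needs every element to be a (key, value) pair
    error = None
except ValueError as exc:
    error = str(exc)

print(pairs)
print(error)
```

Any item with more than one = sign produces a three-element list, and dict() rejects it.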

Example of a successful session:

  1. create a Jupyter kernelspec
$ dj westeu -k -e VAR1=1 VAR2=2
Valid version of conda detected: 4.10.1

* Getting host and token from .databrickscfg

* Select remote cluster

? Which cluster to connect to? 0: 'tiny-8.1' (id: 0726-093206-cams456, state: RUNNING, workers: 1)
   => Selected cluster: tiny-8.1 (20.103.251.219:2200)

* Configuring ssh config for remote cluster
   => ~/.ssh/config will be changed
   => A backup of the current ~/.ssh/config has been created
   => at ~/.databrickslabs_jupyterlab/ssh_config_backup/config.2021-09-02_21-32-13
   => Jupyterlab Integration made the following changes to /Users/bernhard/.ssh/config:
  Host ...
  ...
   => Known hosts fingerprint added for 20.xxx.xxx.xxx:2200

   => Testing whether cluster can be reached
   => OK

* Installing driver libraries
   => Installing  ipywidgets==7.6.4 ipykernel==5.5.5 databrickslabs-jupyterlab==2.2.1 pygments>=2.4.1
   => OK

* Creating remote kernel spec
args.extra_env None
   => Creating kernel specification for profile 'westeu'
   => Kernel specification 'SSH 0123-45678-cams456 westeu:tiny-8.1 (dj/Spark)' created or updated
   => OK

* Setting global config of jupyter lab (autorestart, timeout)
   => OK

You should then see something like

[I 2021-09-02 21:37:44.876 ServerApp] Kernel started: 0e8e4f8e-9a73-47f0-90f1-28cfff868ce1
[W 2021-09-02 21:37:48.514 ServerApp] Got events for closed stream None
[I 2021-09-02 21:37:54.852 ServerApp] Kernel shutdown: 0e8e4f8e-9a73-47f0-90f1-28cfff868ce1
[I 2021-09-02 21:37:55.193 ServerApp] Kernel started: 27153ec4-d67d-4c4a-b18c-b6edd6645281
[I 21:37:56.440 DatabricksKernel] Gateway created for cluster '0123-45678-cams456'

[I 21:37:59.213 DatabricksKernel] Remote python path: /local_disk0/pythonVirtualEnvDirs/virtualEnv-904399e5-eda8-4acf-ab14-f38071da34dd
[I 21:37:59.227 DatabricksKernel] Creating remote connection info
[I 21:37:59.774 DatabricksKernel] Setting up ssh tunnels
[I 21:37:59.775 DatabricksKernel] Starting remote kernel
[I 21:37:59.828 DatabricksKernel] Remote kernel is alive
[W 21:38:01.834 DatabricksKernel] Warning: Timeout waiting for output

Remote init: profile=westeu, organisation= ... . cluster_id=0123-45678-cams456, host=https://adb- ... .azuredatabricks.net/
Remote init: Spark UI = https://adb- ... .13.azuredatabricks.net//?o= ... #/setting/clusters/0123-45678-cams456/sparkUi
Remote init: Connected
Remote init: Spark Session created
Remote init: Configuring mlflow
Cannot initialize mlflow
Remote init: Configuring scala Command
Remote init: The following global variables have been created:
- spark       Spark session
- sc          Spark context
- sqlContext  Hive Context
- dbutils     Databricks utilities (filesystem access only)
- dbbrowser   Allows to browse dbfs and databases:
              - dbbrowser.dbfs()
              - dbbrowser.databases()

There is no need to call dbcontext any more. The Spark context is immediately available in the notebook.

bernhard-42 avatar Sep 02 '21 19:09 bernhard-42

Thanks for the reply.

The command I use is dj nelson -k, not dj nelson -k -e. Then I type jupyter lab and the browser opens the JupyterLab IDE.

nelsontseng0704 avatar Sep 03 '21 04:09 nelsontseng0704

Even when I try dj rsa -l, I still get the same ValueError: dictionary update sequence element #1 has length 3; 2 is required.

(dj2) nelsontseng@Nelsons-MBP ~ % dj rsa -l     
Valid version of conda detected: 4.10.3

* Getting host and token from .databrickscfg

* Select remote cluster

? Which cluster to connect to? 0: 'Creed' (id: 0903-035924-wooer7, state: RUNNING, workers: 2)
   => Selected cluster: Creed (ec2-18-237-210-172.us-west-2.compute.amazonaws.com:2200)

* Configuring ssh config for remote cluster
   => ~/.ssh/config will be changed
   => A backup of the current ~/.ssh/config has been created
   => at ~/.databrickslabs_jupyterlab/ssh_config_backup/config.2021-09-03_12-59-16
   => Jupyterlab Integration made the following changes to /Users/nelsontseng/.ssh/config:
  
  Host 0824-041225-sprat561
      HostName ec2-34-222-70-116.us-west-2.compute.amazonaws.com
      IdentityFile ~/.ssh/id_nelson
      Port 2200
      User ubuntu
      ServerAliveInterval 30
      ServerAliveCountMax 5760
      ConnectTimeout 5
  Host 0903-035924-wooer7
      HostName ec2-18-237-210-172.us-west-2.compute.amazonaws.com
      IdentityFile ~/.ssh/id_rsa
      Port 2200
      User ubuntu
      ServerAliveInterval 30
      ServerAliveCountMax 5760
      ConnectTimeout 5
   => Known hosts fingerprint added for ec2-18-237-210-172.us-west-2.compute.amazonaws.com:2200

   => Testing whether cluster can be reached
   => OK

* Installing driver libraries
   => Installing  ipywidgets==7.6.4 ipykernel==5.5.5 databrickslabs-jupyterlab==2.2.1 pygments>=2.4.1
   => OK
[I 2021-09-03 12:59:25.009 ServerApp] databrickslabs_jupyterlab | extension was successfully linked.
[I 2021-09-03 12:59:25.018 ServerApp] jupyterlab | extension was successfully linked.
[I 2021-09-03 12:59:25.031 ServerApp] nbclassic | extension was successfully linked.
[I 2021-09-03 12:59:25.031 ServerApp] ssh_ipykernel | extension was successfully linked.
[I 2021-09-03 12:59:25.087 ServerApp] nbclassic | extension was successfully loaded.
[I 2021-09-03 12:59:25.089 ServerApp] databrickslabs_jupyterlab | extension was successfully loaded.
[I 2021-09-03 12:59:25.090 LabApp] JupyterLab extension loaded from /Users/nelsontseng/opt/anaconda3/envs/dj2/lib/python3.8/site-packages/jupyterlab
[I 2021-09-03 12:59:25.090 LabApp] JupyterLab application directory is /Users/nelsontseng/opt/anaconda3/envs/dj2/share/jupyter/lab
[I 2021-09-03 12:59:25.094 ServerApp] jupyterlab | extension was successfully loaded.
[I 2021-09-03 12:59:25.094 ServerApp] ssh_ipykernel | extension was successfully loaded.
[I 2021-09-03 12:59:25.095 ServerApp] Serving notebooks from local directory: /Users/nelsontseng
[I 2021-09-03 12:59:25.095 ServerApp] Jupyter Server 1.10.2 is running at:
[I 2021-09-03 12:59:25.095 ServerApp] http://localhost:8888/lab?token=387c44c71f79fe7b68c42c266791d6256ad812d1425dd580
[I 2021-09-03 12:59:25.095 ServerApp]  or http://127.0.0.1:8888/lab?token=387c44c71f79fe7b68c42c266791d6256ad812d1425dd580
[I 2021-09-03 12:59:25.095 ServerApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 2021-09-03 12:59:25.102 ServerApp] 
    
    To access the server, open this file in a browser:
        file:///Users/nelsontseng/Library/Jupyter/runtime/jpserver-46160-open.html
    Or copy and paste one of these URLs:
        http://localhost:8888/lab?token=387c44c71f79fe7b68c42c266791d6256ad812d1425dd580
     or http://127.0.0.1:8888/lab?token=387c44c71f79fe7b68c42c266791d6256ad812d1425dd580
[W 2021-09-03 12:59:25.393 ServerApp] 404 GET /api/kernels/844060cb-e44e-4031-8ac0-8e13aadf996c/channels?session_id=983ef136-49bf-4724-b14e-01c6854c9a42 (::1): Kernel does not exist: 844060cb-e44e-4031-8ac0-8e13aadf996c
[W 2021-09-03 12:59:25.414 ServerApp] 404 GET /api/kernels/844060cb-e44e-4031-8ac0-8e13aadf996c/channels?session_id=983ef136-49bf-4724-b14e-01c6854c9a42 (::1) 24.98ms referer=None
[W 2021-09-03 12:59:28.424 LabApp] Could not determine jupyterlab build status without nodejs
[W 2021-09-03 12:59:28.743 ServerApp] 404 GET /api/kernels/844060cb-e44e-4031-8ac0-8e13aadf996c/channels?session_id=d20a6ad3-d152-4245-beea-0f1564129231 (::1): Kernel does not exist: 844060cb-e44e-4031-8ac0-8e13aadf996c
[W 2021-09-03 12:59:28.745 ServerApp] 404 GET /api/kernels/844060cb-e44e-4031-8ac0-8e13aadf996c/channels?session_id=d20a6ad3-d152-4245-beea-0f1564129231 (::1) 6.68ms referer=None
[I 2021-09-03 12:59:34.340 ServerApp] Creating new notebook in 
[I 2021-09-03 12:59:34.620 ServerApp] Kernel started: b294e363-c3c5-449f-80e2-aead3655bb35
Traceback (most recent call last):
  File "/Users/nelsontseng/opt/anaconda3/envs/dj2/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/Users/nelsontseng/opt/anaconda3/envs/dj2/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/Users/nelsontseng/opt/anaconda3/envs/dj2/lib/python3.8/site-packages/databrickslabs_jupyterlab/__main__.py", line 65, in <module>
    main(args.host, connection_info, args.python, args.s, args.timeout, args.env, args.no_spark)
  File "/Users/nelsontseng/opt/anaconda3/envs/dj2/lib/python3.8/site-packages/databrickslabs_jupyterlab/__main__.py", line 19, in main
    kernel = DatabricksKernel(host, conn_info, python_path, sudo, timeout, env, no_spark)
  File "/Users/nelsontseng/opt/anaconda3/envs/dj2/lib/python3.8/site-packages/databrickslabs_jupyterlab/kernel.py", line 37, in __init__
    self.dbjl_env = dict([e.split("=") for e in env[0].split(" ")])
ValueError: dictionary update sequence element #1 has length 3; 2 is required
[W 2021-09-03 12:59:54.646 ServerApp] Timeout waiting for kernel_info reply from b294e363-c3c5-449f-80e2-aead3655bb35
[W 2021-09-03 12:59:55.377 ServerApp] Replacing stale connection: 844060cb-e44e-4031-8ac0-8e13aadf996c:983ef136-49bf-4724-b14e-01c6854c9a42

nelsontseng0704 avatar Sep 03 '21 05:09 nelsontseng0704

Could you maybe add a logging command to the file "/Users/nelsontseng/opt/anaconda3/envs/dj2/lib/python3.8/site-packages/databrickslabs_jupyterlab/kernel.py" before line 37, so that it looks like this:

self._logger.info("ENV: %s", env) 
self.dbjl_env = dict([e.split("=") for e in env[0].split(" ")])

and run dj rsa -l again? I would like to see what this variable contains.

bernhard-42 avatar Sep 03 '21 16:09 bernhard-42

Thanks for your kind help again. The image below is a screenshot of "/Users/nelsontseng/opt/anaconda3/envs/dj2/lib/python3.8/site-packages/databrickslabs_jupyterlab/kernel.py"

Screen Shot 2021-09-04 at 2 19 02 AM

Here is the output after I run dj rsa -l:

? Which cluster to connect to? 0: 'Creed' (id: 0903-035924-wooer7, state: RUNNING, workers: 1)
   => Selected cluster: Creed (ec2-52-36-246-201.us-west-2.compute.amazonaws.com:2200)

* Configuring ssh config for remote cluster
   => ~/.ssh/config will be changed
   => A backup of the current ~/.ssh/config has been created
   => at ~/.databrickslabs_jupyterlab/ssh_config_backup/config.2021-09-04_02-29-02
   => Jupyterlab Integration made the following changes to /Users/nelsontseng/.ssh/config:
  
  Host 0824-041225-sprat561
      HostName ec2-34-222-70-116.us-west-2.compute.amazonaws.com
      IdentityFile ~/.ssh/id_nelson
      Port 2200
      User ubuntu
      ServerAliveInterval 30
      ServerAliveCountMax 5760
      ConnectTimeout 5
  Host 0903-035924-wooer7
-     HostName ec2-18-237-210-172.us-west-2.compute.amazonaws.com
?                  ^^  ^^   -----

+     HostName ec2-52-36-246-201.us-west-2.compute.amazonaws.com
?                  ^^^^^  ^^  +

      IdentityFile ~/.ssh/id_rsa
      Port 2200
      User ubuntu
      ServerAliveInterval 30
      ServerAliveCountMax 5760
      ConnectTimeout 5
   => Known hosts fingerprint added for ec2-52-36-246-201.us-west-2.compute.amazonaws.com:2200

   => Testing whether cluster can be reached
   => OK

* Installing driver libraries
   => Installing  ipywidgets==7.6.4 ipykernel==5.5.5 databrickslabs-jupyterlab==2.2.1 pygments>=2.4.1
   => OK
[I 2021-09-04 02:29:13.522 ServerApp] databrickslabs_jupyterlab | extension was successfully linked.
[I 2021-09-04 02:29:13.535 ServerApp] jupyterlab | extension was successfully linked.
[I 2021-09-04 02:29:13.555 ServerApp] nbclassic | extension was successfully linked.
[I 2021-09-04 02:29:13.555 ServerApp] ssh_ipykernel | extension was successfully linked.
[I 2021-09-04 02:29:13.631 ServerApp] nbclassic | extension was successfully loaded.
[I 2021-09-04 02:29:13.639 ServerApp] databrickslabs_jupyterlab | extension was successfully loaded.
[I 2021-09-04 02:29:13.640 LabApp] JupyterLab extension loaded from /Users/nelsontseng/opt/anaconda3/envs/dj2/lib/python3.8/site-packages/jupyterlab
[I 2021-09-04 02:29:13.640 LabApp] JupyterLab application directory is /Users/nelsontseng/opt/anaconda3/envs/dj2/share/jupyter/lab
[I 2021-09-04 02:29:13.644 ServerApp] jupyterlab | extension was successfully loaded.
[I 2021-09-04 02:29:13.644 ServerApp] ssh_ipykernel | extension was successfully loaded.
[I 2021-09-04 02:29:13.645 ServerApp] Serving notebooks from local directory: /Users/nelsontseng
[I 2021-09-04 02:29:13.645 ServerApp] Jupyter Server 1.10.2 is running at:
[I 2021-09-04 02:29:13.645 ServerApp] http://localhost:8888/lab?token=d648a7040769c5263f31ab7ed43a59cba796c261ab28ba2d
[I 2021-09-04 02:29:13.645 ServerApp]  or http://127.0.0.1:8888/lab?token=d648a7040769c5263f31ab7ed43a59cba796c261ab28ba2d
[I 2021-09-04 02:29:13.645 ServerApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 2021-09-04 02:29:13.655 ServerApp] 
    
    To access the server, open this file in a browser:
        file:///Users/nelsontseng/Library/Jupyter/runtime/jpserver-91990-open.html
    Or copy and paste one of these URLs:
        http://localhost:8888/lab?token=d648a7040769c5263f31ab7ed43a59cba796c261ab28ba2d
     or http://127.0.0.1:8888/lab?token=d648a7040769c5263f31ab7ed43a59cba796c261ab28ba2d
[W 2021-09-04 02:29:17.180 LabApp] Could not determine jupyterlab build status without nodejs
[I 2021-09-04 02:29:18.132 ServerApp] Kernel started: 2ad7c9d8-d72c-4594-a7e2-e4c3dc903c4b
[I 2021-09-04 02:29:18.150 ServerApp] Kernel started: 53633217-108d-4d7d-8ac3-c6d108218ae9
[I 02:29:18.554 DatabricksKernel] ENV: ['DBJL_PROFILE=rsa DBJL_HOST=https://dbc-f7271877-b294.cloud.databricks.com/?o=4887657523819851 DBJL_CLUSTER=0903-035924-wooer7']
Traceback (most recent call last):
  File "/Users/nelsontseng/opt/anaconda3/envs/dj2/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/Users/nelsontseng/opt/anaconda3/envs/dj2/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/Users/nelsontseng/opt/anaconda3/envs/dj2/lib/python3.8/site-packages/databrickslabs_jupyterlab/__main__.py", line 65, in <module>
    main(args.host, connection_info, args.python, args.s, args.timeout, args.env, args.no_spark)
  File "/Users/nelsontseng/opt/anaconda3/envs/dj2/lib/python3.8/site-packages/databrickslabs_jupyterlab/__main__.py", line 19, in main
    kernel = DatabricksKernel(host, conn_info, python_path, sudo, timeout, env, no_spark)
  File "/Users/nelsontseng/opt/anaconda3/envs/dj2/lib/python3.8/site-packages/databrickslabs_jupyterlab/kernel.py", line 37, in __init__
    self.dbjl_env = dict([e.split("=") for e in env[0].split(" ")])
ValueError: dictionary update sequence element #1 has length 3; 2 is required
Traceback (most recent call last):
  File "/Users/nelsontseng/opt/anaconda3/envs/dj/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/Users/nelsontseng/opt/anaconda3/envs/dj/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/Users/nelsontseng/opt/anaconda3/envs/dj/lib/python3.8/site-packages/databrickslabs_jupyterlab/__main__.py", line 65, in <module>
    main(args.host, connection_info, args.python, args.s, args.timeout, args.env, args.no_spark)
  File "/Users/nelsontseng/opt/anaconda3/envs/dj/lib/python3.8/site-packages/databrickslabs_jupyterlab/__main__.py", line 19, in main
    kernel = DatabricksKernel(host, conn_info, python_path, sudo, timeout, env, no_spark)
  File "/Users/nelsontseng/opt/anaconda3/envs/dj/lib/python3.8/site-packages/databrickslabs_jupyterlab/kernel.py", line 37, in __init__
    self.dbjl_env = dict([e.split("=") for e in env[0].split(" ")])
ValueError: dictionary update sequence element #1 has length 3; 2 is required
[W 2021-09-04 02:29:38.949 ServerApp] Timeout waiting for kernel_info reply from 53633217-108d-4d7d-8ac3-c6d108218ae9
[I 2021-09-04 02:29:39.856 ServerApp] Kernel shutdown: 53633217-108d-4d7d-8ac3-c6d108218ae9
[I 2021-09-04 02:29:39.879 ServerApp] Kernel started: ff10c72a-d643-4e62-b34c-3c9d12fa98cf
[I 02:29:40.301 DatabricksKernel] ENV: ['DBJL_PROFILE=rsa DBJL_HOST=https://dbc-f7271877-b294.cloud.databricks.com/?o=4887657523819851 DBJL_CLUSTER=0903-035924-wooer7']
Traceback (most recent call last):
  File "/Users/nelsontseng/opt/anaconda3/envs/dj2/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/Users/nelsontseng/opt/anaconda3/envs/dj2/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/Users/nelsontseng/opt/anaconda3/envs/dj2/lib/python3.8/site-packages/databrickslabs_jupyterlab/__main__.py", line 65, in <module>
    main(args.host, connection_info, args.python, args.s, args.timeout, args.env, args.no_spark)
  File "/Users/nelsontseng/opt/anaconda3/envs/dj2/lib/python3.8/site-packages/databrickslabs_jupyterlab/__main__.py", line 19, in main
    kernel = DatabricksKernel(host, conn_info, python_path, sudo, timeout, env, no_spark)
  File "/Users/nelsontseng/opt/anaconda3/envs/dj2/lib/python3.8/site-packages/databrickslabs_jupyterlab/kernel.py", line 37, in __init__
    self.dbjl_env = dict([e.split("=") for e in env[0].split(" ")])
ValueError: dictionary update sequence element #1 has length 3; 2 is required
[W 2021-09-04 02:29:41.709 ServerApp] Timeout waiting for kernel_info reply from 2ad7c9d8-d72c-4594-a7e2-e4c3dc903c4b
[W 2021-09-04 02:29:47.699 ServerApp] Nudge: attempt 10 on kernel 2ad7c9d8-d72c-4594-a7e2-e4c3dc903c4b
[W 2021-09-04 02:29:54.221 ServerApp] Nudge: attempt 20 on kernel 2ad7c9d8-d72c-4594-a7e2-e4c3dc903c4b

nelsontseng0704 avatar Sep 03 '21 18:09 nelsontseng0704

OK, I think this is the issue: DBJL_HOST=https://dbc-f7271877-b294.cloud.databricks.com/?o=4887657523819851. It contains two = signs (the second one in ?o=...), which my parsing code does not handle.
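
As a rough sketch of how the parsing could tolerate an = inside a value: split each item only at its first = sign. This is not the project's actual fix; `parse_env` is a hypothetical helper, and the input string is the ENV value from the log above.

```python
def parse_env(env_string):
    """Parse 'K1=V1 K2=V2' into a dict, keeping any '=' inside the values."""
    result = {}
    for item in env_string.split(" "):
        if not item:
            continue  # skip empty items caused by repeated spaces
        # partition() splits at the first "=" only
        key, sep, value = item.partition("=")
        if not sep:
            raise ValueError(f"expected KEY=VALUE, got {item!r}")
        result[key] = value
    return result


# ENV value copied from the DatabricksKernel log line above.
env = [
    "DBJL_PROFILE=rsa "
    "DBJL_HOST=https://dbc-f7271877-b294.cloud.databricks.com/?o=4887657523819851 "
    "DBJL_CLUSTER=0903-035924-wooer7"
]
parsed = parse_env(env[0])
print(parsed["DBJL_HOST"])
```

With this approach the ?o=... part stays in the DBJL_HOST value instead of becoming a third list element.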

I guess, you have something like

[rsa]
host = https://dbc-f7271877-b294.cloud.databricks.com/?o=4887657523819851
token = ...

in your ~/.databrickscfg, right?

The o=4887657523819851 should not be necessary. In fact, I think there is no org id on AWS; it only exists for Azure.

Could you please try:

[rsa]
host = https://dbc-f7271877-b294.cloud.databricks.com
token = ...
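
Editing the host by hand is enough here. If you would rather check a value programmatically, a minimal sketch using only the standard library could look like this; `strip_query` is a hypothetical helper, not part of dj.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_query(host):
    """Return the host URL without query string, fragment, or trailing slash."""
    parts = urlsplit(host)
    path = parts.path.rstrip("/")
    # Rebuild the URL with empty query and fragment components.
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


cleaned = strip_query(
    "https://dbc-f7271877-b294.cloud.databricks.com/?o=4887657523819851"
)
print(cleaned)
```

The result is the plain workspace URL that goes into the host entry of the profile.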

bernhard-42 avatar Sep 03 '21 19:09 bernhard-42

Problem Solved! Thx!!

nelsontseng0704 avatar Sep 07 '21 15:09 nelsontseng0704