ipex-llm
"PalProcessExit: Returning exit code 1" when running JupyterLab in the trusted-bigdata container locally
I am trying to run JupyterLab in the trusted-bigdata container locally, but the Jupyter service fails to start. I used the following commands:
export KEYS_PATH=/root/BigDL23/BigDL/ppml/keys/
export LOCAL_IP=*.*.*.*
export DOCKER_IMAGE=intelanalytics/bigdl-ppml-trusted-bigdata-gramine-reference-8g:2.4.0-SNAPSHOT
sudo docker run -itd \
--net=host \
--cpus=8 \
--oom-kill-disable \
--device=/dev/sgx/enclave \
--device=/dev/sgx/provision \
--name=jupyter \
-v /var/run/aesmd/aesm.socket:/var/run/aesmd/aesm.socket \
-v $KEYS_PATH:/ppml/keys \
-e RUNTIME_DRIVER_PORT=54321 \
-e RUNTIME_DRIVER_MEMORY=16g \
-e LOCAL_IP=$LOCAL_IP \
$DOCKER_IMAGE bash
echo "export JUPYTER_RUNTIME_DIR=$JUPYTER_RUNTIME_DIR && \
export JUPYTER_DATA_DIR=$JUPYTER_DATA_DIR && \
usr/local/bin/jupyter-lab notebook \
--notebook-dir=/ppml/apps \
--ip=0.0.0.0 \
--port=8889 \
--no-browser \
--allow-root" >> temp_command_file
export sgx_command="bash temp_command_file"
gramine-sgx bash 2>&1 | tee /ppml/jupyter-notebook.log
Gramine started successfully, but it produced a log of more than 21,000 lines, which contains the following error:
PermissionError: [Errno 13] Permission denied: '/root/.local'
But when I checked, this folder does not exist in the container, and creating it did not help. The log ends with these messages:
[P1:libos] debug: IPC worker: received IPC message from 15: code=17 size=21 seq=2
[P1:libos] debug: clearing POSIX locks for pid 15
[P1:libos] debug: Sending ipc message to 15
[P15:libos] debug: IPC worker: received IPC message from 1: code=0 size=21 seq=2
[P15:libos] debug: Got an IPC response from 1, seq: 2
[P15:T15:python3] debug: Waiting finished: 0
[P15:T15:python3] debug: Sending ipc message to 1
[P15:T15:python3] debug: sync client shutdown: closing handles
[P15:T15:python3] debug: sync client shutdown: waiting for confirmation
[P15:T15:python3] debug: sync client shutdown: finished
[P15:T15:python3] debug: ipc_release_id_range: sending a request: [15..15]
[P15:T15:python3] debug: Sending ipc message to 1
[P15:T15:python3] debug: ipc_release_id_range: ipc_send_message: 0
[P1:libos] debug: IPC worker: received IPC message from 15: code=2 size=37 seq=0
[P15:libos] debug: IPC worker: exiting worker thread
[P1:libos] debug: IPC callback from 15: IPC_MSG_CHILDEXIT(1, 15, 1, 0)
[P1:libos] debug: Child process (pid: 15) died
[P15:T15:python3] debug: process 15 exited with status 1
debug: PalProcessExit: Returning exit code 1
[P1:T1:bash] trace: ---- return from wait4(...) = 0xf
[P1:T1:bash] trace: ---- rt_sigaction([SIGINT], 0x3c4d68ae0, 0x3c4d68b80, 0x8) = 0x0
[P1:libos] debug: IPC worker: received IPC message from 15: code=4 size=25 seq=0
[P1:libos] debug: ipc_release_id_range_callback: release_id_range(15..15)
[P1:T1:bash] trace: ---- ioctl(2, TIOCGWINSZ, 0x3c4d68d60) ...
[P1:T1:bash] trace: ---- return from ioctl(...) = -38
[P1:T1:bash] trace: ---- rt_sigprocmask(SETMASK, [], NULL, 0x8) = 0x0
[P1:T1:bash] debug: Created sigframe for sig: 17 at 0x3c4d68090 (handler: 0x3c50b3a70, restorer: 0x3c4e2eb40)
[P1:T1:bash] trace: ---- wait4(-1, 0x3c4d68020, WNOHANG, 0) ...
[P1:T1:bash] trace: ---- return from wait4(...) = -10
[P1:T1:bash] trace: ---- rt_sigreturn()
[P1:T1:bash] trace: ---- read(255, 0x3c51e2a40, 0xb97) ...
[P1:T1:bash] trace: ---- return from read(...) = 0x0
[P1:T1:bash] trace: ---- rt_sigprocmask(BLOCK, [SIGCHLD,], [], 0x8) = 0x0
[P1:T1:bash] trace: ---- rt_sigprocmask(SETMASK, [], NULL, 0x8) = 0x0
[P1:T1:bash] debug: ---- exit_group (returning 1)
[P1:T1:bash] debug: clearing POSIX locks for pid 1
[P1:T1:bash] debug: sync client shutdown: closing handles
[P1:T1:bash] debug: sync client shutdown: waiting for confirmation
[P1:T1:bash] debug: sync client shutdown: finished
[P1:libos] debug: IPC worker: exiting worker thread
[P1:T1:bash] debug: process 1 exited with status 1
debug: PalProcessExit: Returning exit code 1
Then Gramine exited without starting JupyterLab. Why does this happen?
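With a log this long, it can help to drop Gramine's own debug and trace lines so that the application's errors stand out. The following is a minimal sketch, assuming the log format shown above, where Gramine lines start with an optional `[P...]` prefix followed by `debug:` or `trace:` and the application's output is not prefixed. `filter_gramine_log` is a hypothetical helper name, not part of Gramine or the PPML image.

```shell
# Hypothetical helper: print the lines of a Gramine log that are NOT Gramine's
# own debug/trace output, prefixed with their line number in the original file.
# Assumption: application output (e.g. a Python traceback) carries no
# "[P..] debug:" / "[P..] trace:" prefix.
filter_gramine_log() {
    grep -n -v -E '^(\[P[0-9]+:[^]]*\] )?(debug|trace): ' "$1"
}
```

For example, `filter_gramine_log /ppml/jupyter-notebook.log | head -n 50` would show the first remaining lines, which is where an error such as the `PermissionError` above should appear.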
Maybe you didn't set JUPYTER_RUNTIME_DIR and JUPYTER_DATA_DIR, and 'usr/local/' should be /usr/local. I tried the following commands and successfully started the Jupyter service locally:
cd /ppml
export JUPYTER_RUNTIME_DIR=/ppml/jupyter/runtime
export JUPYTER_DATA_DIR=/ppml/jupyter/data
bash init.sh
echo "export JUPYTER_RUNTIME_DIR=$JUPYTER_RUNTIME_DIR && \
export JUPYTER_DATA_DIR=$JUPYTER_DATA_DIR && \
/usr/local/bin/jupyter-lab notebook \
--notebook-dir=/ppml/apps \
--ip=0.0.0.0 \
--port=8889 \
--no-browser \
--allow-root" >> temp_command_file
#bash temp_command_file
export sgx_command="bash temp_command_file"
gramine-sgx bash 2>&1 | tee /ppml/jupyter-notebook.log
You can also try running without SGX first if you encounter any problems.
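Because the `echo "..."` above uses double quotes, `$JUPYTER_RUNTIME_DIR` and `$JUPYTER_DATA_DIR` are expanded when the file is written, so unset variables end up as empty assignments in `temp_command_file`. With an empty `JUPYTER_DATA_DIR`, Jupyter falls back to its default data directory under the home directory (`~/.local/share/jupyter` on Linux), which would be consistent with the `/root/.local` error. A minimal sketch of a pre-flight check follows; `check_jupyter_env` and `write_command_file` are hypothetical helper names, not part of the PPML image, and `write_command_file` uses `>` instead of `>>` so that a rerun replaces the file rather than appending a second command to it.

```shell
# Sketch only: both function names below are hypothetical helpers.

# Fail early if a variable that gets baked into the command file is empty,
# or if the Jupyter binary path is not an absolute, executable path.
check_jupyter_env() {
    jupyter_bin="$1"
    if [ -z "${JUPYTER_RUNTIME_DIR:-}" ] || [ -z "${JUPYTER_DATA_DIR:-}" ]; then
        echo "error: JUPYTER_RUNTIME_DIR and JUPYTER_DATA_DIR must be set" >&2
        return 1
    fi
    case "$jupyter_bin" in
        /*) ;;
        *) echo "error: $jupyter_bin is not an absolute path" >&2; return 1 ;;
    esac
    if [ ! -x "$jupyter_bin" ]; then
        echo "error: $jupyter_bin is not executable" >&2
        return 1
    fi
}

# Write the command file with '>' so that it is overwritten on every run.
# The unquoted heredoc expands the variables at write time, like the echo above.
write_command_file() {
    jupyter_bin="$1"
    out_file="$2"
    cat > "$out_file" <<EOF
export JUPYTER_RUNTIME_DIR=$JUPYTER_RUNTIME_DIR && \\
export JUPYTER_DATA_DIR=$JUPYTER_DATA_DIR && \\
$jupyter_bin notebook \\
    --notebook-dir=/ppml/apps \\
    --ip=0.0.0.0 \\
    --port=8889 \\
    --no-browser \\
    --allow-root
EOF
}
```

A possible way to use it: `check_jupyter_env /usr/local/bin/jupyter-lab && write_command_file /usr/local/bin/jupyter-lab temp_command_file`, then run `bash temp_command_file` directly (no SGX) to confirm that Jupyter starts before switching to `gramine-sgx bash`.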
@hzjane Hi, I have some questions about running Jupyter inside SGX. Q1: Did your solution patch JupyterLab's source code, or does it run directly in the TEE without any modifications? Q2: If JupyterLab was not patched, does that mean that each time code is submitted from the Jupyter web UI, a subprocess is cloned to run that code inside SGX for security purposes? Q3: A possible approach would be to run the web UI outside SGX and only the JupyterLab kernel inside SGX. Have you considered this approach? Thank you!
Hi. Q1: We didn't apply any patch to it. Q2: Starting a new kernel may call a subprocess to run it. Q3: No, we have only tried running both the web UI and the Jupyter kernel inside SGX.
Thanks, I have successfully started the Jupyter service locally in the bigdl-ppml container. Now I am trying to run JupyterLab in the official Gramine 1.5 Docker image. Do I need to make additional modifications to Gramine's Docker image?
The PPML image uses Gramine v1.3.1 as its base image, and I don't think there are many changes in version 1.5 if EDMM is not enabled. You probably need to install the jupyter and jupyterlab libraries, and then just try it in the Gramine 1.5 Docker image.
@hzjane So your solution did not apply any patch to Gramine, right? It seems that Gramine does not support netlink natively, so we had assumed that your image carried custom Gramine patches to support netlink.
We did patch Gramine to support netlink. https://github.com/gramineproject/gramine/compare/master...analytics-zoo:gramine:devel-v1.5.0-2023-07-19