Getting stuck when executing Papermill with sparkmagic kernel
🐛 Bug
I am running the following command with the sparkmagic kernel (https://github.com/jupyter-incubator/sparkmagic):
/opt/conda/bin/papermill ./input.ipynb ./output.ipynb --log-output --log-level DEBUG --progress-bar --autosave-cell-every 5
This kernel connects to an EMR cluster and executes SQL queries. Normally things work, but execution gets stuck whenever we use the %%sql magic. The issue occurs only when running with papermill; it does not happen when running interactively.
%%sql
show databases
The kernel returns the output, but papermill does not end the current cell. The autosave option captures the output, but the cell metadata shows the cell is still running:
duration: null
end_time: null
exception: false
start_time: "2023-03-16T23:41:30.662269"
status: "running"
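For anyone trying to reproduce this, the autosaved output notebook can be scanned for cells left in this state. The following is a minimal sketch, assuming the metadata shown above is stored under each cell's `metadata.papermill` key in the notebook JSON; the function names `find_running_cells` and `load_notebook` are made up for illustration and are not part of papermill.

```python
import json


def load_notebook(path):
    """Read a notebook file (plain JSON) into a dict."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def find_running_cells(notebook):
    """Return (cell_index, start_time) for cells still marked as running.

    Assumption: papermill records per-cell state under
    cell["metadata"]["papermill"], with the keys shown in the report
    (status, start_time, end_time, duration, exception).
    """
    running = []
    for index, cell in enumerate(notebook.get("cells", [])):
        meta = cell.get("metadata", {}).get("papermill", {})
        # A cell that started but never finished keeps status "running"
        # and has no end_time.
        if meta.get("status") == "running" and meta.get("end_time") is None:
            running.append((index, meta.get("start_time")))
    return running
```

With the output file from the command above, `find_running_cells(load_notebook("./output.ipynb"))` would list the index and start time of each cell that never completed.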
I am running out of ideas for debugging this issue. Can someone help me understand why papermill does not end the cell execution?
Debug messages right before it gets stuck:
msg_type: display_data
content: {'data': {'text/plain': '<IPython.core.display.HTML object>', 'text/html': '<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border="1" class="dataframe hideme">\n <thead>\n <tr style="text-align: right;">\n <th></th>\n <th>namespace</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>0</th>\n <td>default</td>\n </tr>\n </tbody>\n</table>\n</div>'}, 'metadata': {}, 'transient': {}}
msg_type: comm_msg
content: {'data': {'method': 'update', 'state': {'msg_id': ''}, 'buffer_paths': []}, 'comm_id': 'df913063ede044d29a76d4febf6d372e'}
msg_type: status
content: {'execution_state': 'idle'}
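One thing that may be worth checking, offered as a guess rather than a diagnosis: as far as I understand, nbclient (which papermill uses to run cells) treats a cell as finished only after it has seen both an idle `status` message on IOPub whose parent `msg_id` matches the execute request and an `execute_reply` on the shell channel. The log above ends with an idle status, but it does not show the parent headers or a reply. The sketch below assumes the messages have been captured as dicts in the Jupyter messaging format (`msg_type`, `parent_header`, `content`); `summarize_cell_messages` is a hypothetical helper, not an nbclient or papermill API.

```python
def summarize_cell_messages(messages, request_msg_id):
    """Report which completion signals were seen for one execute request.

    messages: iterable of Jupyter-protocol message dicts.
    request_msg_id: msg_id of the execute_request sent for the cell.
    """
    idle_seen = False
    reply_seen = False
    unrelated_types = []
    for msg in messages:
        msg_type = msg.get("msg_type")
        parent_id = msg.get("parent_header", {}).get("msg_id")
        if parent_id != request_msg_id:
            # Messages with a different parent (for example widget comm
            # traffic) do not count toward finishing this cell.
            unrelated_types.append(msg_type)
            continue
        if msg_type == "status":
            if msg.get("content", {}).get("execution_state") == "idle":
                idle_seen = True
        elif msg_type == "execute_reply":
            reply_seen = True
    return {
        "idle_seen": idle_seen,
        "reply_seen": reply_seen,
        "unrelated_types": unrelated_types,
    }
```

If `idle_seen` is false for the request even though an idle status was logged, the idle message may belong to a different parent; if `reply_seen` is false, the shell reply may never have arrived. Either result would narrow down where the cell is waiting.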