sparkmagic
[BUG] Running sparkmagic notebook in sagemaker lifecycle script
Describe the bug: On a SageMaker notebook instance, I am trying to run a sparkmagic notebook (which talks to an EMR cluster) via nbconvert from inside the lifecycle script that runs during startup. It looks like it isn't picking up the config file.
If I wait until after SageMaker has started, it all connects and works fine.
I know this sounds like a SageMaker issue, but AWS hasn't been any help, so I was hoping someone here has an idea.
To Reproduce: In a SageMaker lifecycle script, run a Python script that uses nbconvert to execute a sparkmagic notebook that tries to talk to an EMR cluster.
Hi @byteford, thanks for opening an issue, and sorry you are running into problems. A few follow-up questions:
- What version of sparkmagic are you using?
- Can you share the Python script? If you want to execute a sparkmagic notebook, the preferred method is to use papermill
- Does it work if you use %%configure?
Hey, thanks for the reply. We are using sparkmagic 0.19.1. Here is the file I use; I have also tried it with a minimal amount of code and it still doesn't work, but I don't have that code to hand. sparkrun.txt
Is there a link explaining how to use %%configure to set the IP address? I can't seem to find one.
I don't have documentation on configuring an IP address with %%configure, but it mimics the structure of the config exactly, so it should be the same as what you have. If you share an example config file, I can try to help.
Also, at a glance, this script is doing exactly what papermill is designed and tested for. Can you try using papermill and see if you still run into the issue?
If you still see the issue with papermill, then it's something to do with the environment of the lifecycle script. I'm not familiar enough with SageMaker to give more specific advice.
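For reference, a minimal sketch of what the papermill route could look like, with a small check of the environment printed first. It assumes sparkmagic is using its default config location (~/.sparkmagic/config.json), and that the home directory seen by the lifecycle script may differ from the one seen by an interactive session; neither has been confirmed in this thread. The function names and the notebook file names in the note below are hypothetical.

```python
import os
from pathlib import Path


def expected_config_path(home=None):
    """Default sparkmagic config location for a given home directory.

    Assumption: the default ~/.sparkmagic/config.json is in use.
    """
    home_dir = Path(home) if home is not None else Path(os.path.expanduser("~"))
    return home_dir / ".sparkmagic" / "config.json"


def describe_environment(home=None):
    """Collect details that can differ between a startup script and a live session."""
    config_path = expected_config_path(home)
    return {
        "HOME": os.environ.get("HOME"),
        "resolved_home": os.path.expanduser("~"),
        "config_path": str(config_path),
        "config_exists": config_path.is_file(),
    }


def run_notebook(input_path, output_path, kernel_name=None):
    """Execute a notebook with papermill after printing the environment."""
    # Imported lazily so the helpers above work even without papermill installed.
    import papermill as pm

    print(describe_environment())
    return pm.execute_notebook(input_path, output_path, kernel_name=kernel_name)
```

Calling run_notebook("sparkmagic_job.ipynb", "sparkmagic_job.out.ipynb") from the lifecycle script and again from a terminal after startup would show whether the two runs resolve the same home directory and can both see the config file. If config_exists is False only in the lifecycle run, that points at the script's environment rather than at sparkmagic.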
Thank you for the information. Using %%configure still had the same issue. I will try with papermill and hope it works :)