paperspace-python behind proxy

Open dalmia opened this issue 6 years ago • 8 comments

Does paperspace-python need any additional settings to work behind a proxy?

dalmia avatar Apr 11 '18 10:04 dalmia

It shouldn't be required. Are you seeing any issues in particular? If you can post some relevant code snippets we can help debug further :)

dte avatar Apr 11 '18 18:04 dte

Thanks, @dte, for getting back. I actually noticed the proxy issue with paperspace-node, which is why I asked the same question about the Python module. With paperspace-node (on Ubuntu), if I just run:

paperspace machines availability --region "Europe (AMS1)" --machineType "P5000"

I get the following error:

{
  "error": "getaddrinfo EAI_AGAIN api.paperspace.io:443"
}

A bit of searching suggests that this is a name-resolution failure, which points to a proxy or DNS problem on my network. Once I switch to mobile data, it works fine.

Coming back to the main problem: I am exploring how to save files and so on. I wrote a simple Python script that copies data from one text file into another one in /storage, and run.sh (below) runs that script and then moves everything from /storage to /artifacts:

python test.py
mv /storage/* /artifacts

And when I run:

paperspace-python run --command "bash run.sh"

I get the following error:

Job Error: Error starting container: Error response from daemon: OCI runtime create failed: 
container_linux.go:296: starting container process caused "process_linux.go:398: container
 init caused \"process_linux.go:381: running prestart hook 1 caused \\\"error running hook: 
exit status 1, stdout: , stderr: exec command: [/usr/bin/nvidia-container-cli --load-kmods 
configure --ldconfig=@/sbin/ldconfig.real --device=all --compute --utility --require=
cuda>=9.0 --pid=22549 var/lib/docker/overlay2/bfe1e5885d03a3df4a089c16f052cebb5746ef307
ebaec7d2f5c5d5667e14920/merged]\\\\nnvidia-container-cli: initialization error: cuda error: 
no cuda-capable device is detected\\\\n\\\"\"": unknown

That's why I checked the machine availability, but it shows available=True. Any reason why this might be happening?

dalmia avatar Apr 11 '18 20:04 dalmia

Aman,

That machine lost access to its GPU temporarily. We were able to restart it and restore GPU access. I just ran a test and everything looks good currently, but we will continue to monitor it. Let us know if you see it again. Just send a note to [email protected] and we will escalate it. We may need to take that particular host out of service if it happens again.

sanfilip avatar Apr 11 '18 21:04 sanfilip

Hi @sanfilip. Good to know that it is fixed. I ran a test just now too, and it works fine. One problem, though: because of the GPU issue, I now have many failed tests in my Job runner under Gradient, which has exhausted my limit of 10 GPU jobs. Is there any way that can be reverted?

dalmia avatar Apr 11 '18 21:04 dalmia

I'm going to forward your request to support. They should be able to credit you for the failures.

sanfilip avatar Apr 11 '18 21:04 sanfilip

Thanks, @sanfilip. Really appreciate the support! Email for the account: [email protected]

dalmia avatar Apr 11 '18 22:04 dalmia

I am now trying to run the actual training with paperspace-python, using the following command:

paperspace-python run --command "bash run.sh" --workspace autoencoder_train.zip  --req ../requirements.txt --project "Splice Site Prediction" --name "AE train"

But after a while, I get back:

{           
  "error": true, 
  "message": "HTTPSConnectionPool(host='api.paperspace.io', port=443): Max retries
   exceeded with url: /jobs/createJob?project=Splice+Site+Prediction&workspaceFileName= 
   autoencoder_train.zip.zip&container=paperspace%2Ftensorflow-python&machineType=P5000& 
   name=AE+train&command=pip2+install+-r+requirements.txt% 0Abash+run.sh (Caused by
   ProxyError('Cannot connect to proxy.', error(\"(110, 'ETIMEDOUT')\",)))"
}

This is confusing since my test script worked, but this is showing a proxy error. Can you please help me with this?

dalmia avatar Apr 12 '18 11:04 dalmia

ping @sanfilip @dte

dalmia avatar Apr 15 '18 19:04 dalmia
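
A note for anyone debugging the two errors above: the ProxyError in the second message suggests that the HTTP client picked up a proxy from the environment and then could not reach it, while the earlier EAI_AGAIN is a DNS lookup failure. The sketch below is not part of paperspace-python; it uses only the Python standard library, and the helper names proxy_settings and can_resolve are made up for illustration. It shows which proxies Python sees (for example from HTTP_PROXY / HTTPS_PROXY) and whether api.paperspace.io resolves from the current network.

```python
import socket
import urllib.request


def proxy_settings():
    """Return the proxy mapping Python detects (e.g. from HTTPS_PROXY)."""
    return urllib.request.getproxies()


def can_resolve(host, port=443):
    """Return True if a DNS lookup for host succeeds, False if it fails."""
    try:
        socket.getaddrinfo(host, port)
        return True
    except socket.gaierror:
        # Covers temporary resolver failures such as EAI_AGAIN.
        return False


if __name__ == "__main__":
    print("proxies seen by Python:", proxy_settings() or "none")
    print("api.paperspace.io resolves:", can_resolve("api.paperspace.io"))
```

If the first line prints a proxy that is not reachable from your machine, correcting or unsetting that environment variable before running paperspace-python is a reasonable first thing to try. If the lookup fails, the problem is more likely DNS on the local network than the module itself.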