paperspace-python
paperspace-python behind proxy
Does paperspace-python need any additional settings to work behind a proxy?
It shouldn't be required. Are you seeing any issues in particular? If you can post some relevant code snippets we can help debug further :)
Thanks, @dte, for getting back. Actually, I noticed the proxy issue with paperspace-node, hence I asked the same about the Python module. With paperspace-node (on Ubuntu), if I just run:
paperspace machines availability --region "Europe (AMS1)" --machineType "P5000"
I get the following error:
{
"error": "getaddrinfo EAI_AGAIN api.paperspace.io:443"
}
After a bit of googling, I found that this indicates a proxy error. Once I switch to mobile data, it works fine.
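For what it's worth, EAI_AGAIN is the resolver's "temporary failure in name resolution" code, so the lookup of api.paperspace.io failed before any connection was attempted. A network that only allows traffic through a proxy and blocks direct DNS is one possible cause, which would fit with mobile data working. A minimal sketch for checking name resolution from the same machine, using only the Python standard library (the helper name can_resolve is made up for illustration):

```python
import socket


def can_resolve(host, port=443):
    """Return True if the local resolver can turn `host` into an address."""
    try:
        socket.getaddrinfo(host, port)
        return True
    except socket.gaierror:
        # Covers resolver failures such as EAI_AGAIN and EAI_NONAME.
        return False
```

Calling can_resolve("api.paperspace.io") on the affected network and again on mobile data should show whether the difference is in DNS rather than in the CLI itself.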
Coming back to the main problem: I am just looking around at how to save files, etc. So I made a simple Python script that copies data from one text file and saves it to another one in /storage, and then I move it from /storage to /artifacts in the file run.sh (below):
python test.py
mv /storage/* /artifacts
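The test.py script itself was not posted. Purely as an illustration of the kind of script described, a hypothetical version could look like the sketch below; the function name copy_text and the default destination are assumptions, not the original code:

```python
import shutil
from pathlib import Path


def copy_text(src, dst_dir="/storage"):
    """Copy the file at `src` into `dst_dir` and return the new path."""
    dst_dir = Path(dst_dir)
    # Create the destination directory if it does not exist yet.
    dst_dir.mkdir(parents=True, exist_ok=True)
    dst = dst_dir / Path(src).name
    shutil.copyfile(src, dst)
    return dst
```

With something like copy_text("input.txt") in test.py, the mv line in run.sh would then move the copied file from /storage to /artifacts.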
And when I run:
paperspace-python run --command "bash run.sh"
I get the following error:
Job Error: Error starting container: Error response from daemon: OCI runtime create failed:
container_linux.go:296: starting container process caused "process_linux.go:398: container
init caused \"process_linux.go:381: running prestart hook 1 caused \\\"error running hook:
exit status 1, stdout: , stderr: exec command: [/usr/bin/nvidia-container-cli --load-kmods
configure --ldconfig=@/sbin/ldconfig.real --device=all --compute --utility --require=
cuda>=9.0 --pid=22549 var/lib/docker/overlay2/bfe1e5885d03a3df4a089c16f052cebb5746ef307
ebaec7d2f5c5d5667e14920/merged]\\\\nnvidia-container-cli: initialization error: cuda error:
no cuda-capable device is detected\\\\n\\\"\"": unknown
That's why I checked the machine availability, but it shows available=True. Any reason why this might be happening?
Aman,
That machine lost access to its GPU temporarily. We were able to restart it and restore GPU access. I just ran a test and everything looks good currently, but we will continue to monitor it. Let us know if you see it again. Just send a note to [email protected] and we will escalate it. We may need to take that particular host out of service if it happens again.
Hi, @sanfilip. Oh, it's good to know that it is fixed. I ran a test now too, and it works fine. Just one problem though, due to that GPU problem, I have got many failed tests in my Job runner under Gradient, thereby exhausting my limit of 10 GPU jobs. Is there any way that can be reverted?
I'm going to forward your request to support. They should be able to credit you for the failures.
Thanks, @sanfilip. Really appreciate the support! Email for the account: [email protected]
I am now trying to run the actual training with paperspace-python using the following command:
paperspace-python run --command "bash run.sh" --workspace autoencoder_train.zip --req ../requirements.txt --project "Splice Site Prediction" --name "AE train"
But after a while, I get back:
{
"error": true,
"message": "HTTPSConnectionPool(host='api.paperspace.io', port=443): Max retries
exceeded with url: /jobs/createJob?project=Splice+Site+Prediction&workspaceFileName=
autoencoder_train.zip.zip&container=paperspace%2Ftensorflow-python&machineType=P5000&
name=AE+train&command=pip2+install+-r+requirements.txt%0Abash+run.sh (Caused by
ProxyError('Cannot connect to proxy.', error(\"(110, 'ETIMEDOUT')\",)))"
}
This is confusing since my test script worked, but this is showing a proxy error. Can you please help me with this?
ping @sanfilip @dte
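One way to narrow this down: the error text has the shape of a requests/urllib3 failure, and requests picks up proxies from the HTTP_PROXY, HTTPS_PROXY and NO_PROXY environment variables (in either case). "Cannot connect to proxy" with ETIMEDOUT suggests a proxy is configured but could not be reached in time. A small sketch for listing what Python sees in the current shell, assuming the client does go through requests; the helper names proxy_environment and detected_proxies are made up for illustration:

```python
import os
import urllib.request

PROXY_VARS = ("HTTP_PROXY", "HTTPS_PROXY", "NO_PROXY")


def proxy_environment(environ=None):
    """Return the proxy-related variables set in `environ` (names matched case-insensitively)."""
    environ = os.environ if environ is None else environ
    found = {}
    for name, value in environ.items():
        if name.upper() in PROXY_VARS and value:
            found[name] = value
    return found


def detected_proxies():
    """Return the scheme-to-proxy mapping the standard library detects."""
    return urllib.request.getproxies()
```

If proxy_environment() comes back empty in the shell where the CLI runs, or shows a proxy address that differs from what the network requires, that would be the first thing to correct before retrying the job.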