ppx
MassIVE has changed the FTP server URLs
I run into an issue when I try to download a file. There appears to be no delay between retries, yet the code runs for only about 8 seconds before failing, which seems odd. It looks more like the connection is being interrupted somehow. Is there any way to solve this issue?
import ppx

# Look up the MassIVE project, then raise the reconnect limit before downloading.
proj = ppx.find_project("MSV000090001", fetch=True, timeout=30)
proj._parser.max_reconnects = 50
proj.download("raw/RP_IDX_RAW/yeast_H2O_OTMS2_IDX_2_MminusHOnly.raw")
I increased the number of retries to 50, but it did not help.
Hi @madina1203 - thanks for sharing your issue!
Can you post the error that you receive as well? Unfortunately, sometimes the repositories change their API suddenly and I suspect that may be the case here.
I was able to confirm that MassIVE now adds v01 to their FTP site URLs, which is causing the issue.
What used to be:
ftp://massive.ucsd.edu/MSV000090001
Is now:
ftp://massive.ucsd.edu/v01/MSV000090001
A patch is inbound!
This was the error:
Traceback (most recent call last):
  File "/Users/madinabekbergenova/Desktop/PhD/phd_ms_spectrum/main.py", line 5, in <module>
    downloaded_raw = proj.download("raw/RP_IDX_RAW/yeast_H2O_OTMS2_IDX_2_MminusHOnly.raw")
  File "/Users/madinabekbergenova/Desktop/PhD/phd_ms_spectrum/venv/lib/python3.10/site-packages/ppx/project.py", line 238, in download
    return self._parser.download(
  File "/Users/madinabekbergenova/Desktop/PhD/phd_ms_spectrum/venv/lib/python3.10/site-packages/ppx/ftp.py", line 262, in download
    self._download_file(
  File "/Users/madinabekbergenova/Desktop/PhD/phd_ms_spectrum/venv/lib/python3.10/site-packages/ppx/ftp.py", line 135, in _download_file
    self.connect()
  File "/Users/madinabekbergenova/Desktop/PhD/phd_ms_spectrum/venv/lib/python3.10/site-packages/ppx/ftp.py", line 84, in connect
    self._with_reconnects(self._connect)
  File "/Users/madinabekbergenova/Desktop/PhD/phd_ms_spectrum/venv/lib/python3.10/site-packages/ppx/ftp.py", line 113, in _with_reconnects
    raise error_temp(
ftplib.error_temp: Failed after 50 reconnect(s), the last error was: 550 Failed to change directory.
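The 550 at the end of the traceback is a 5xx FTP reply. These denote permanent failures, and Python's ftplib raises them as error_perm when they come straight from the server, which would explain why raising the reconnect limit made no difference. Below is a minimal sketch of a retry helper that backs off on temporary errors and gives up immediately on permanent ones. This is not ppx's actual retry logic; `with_retries` and its parameters are hypothetical names used only for illustration.

```python
import ftplib
import time


def with_retries(action, max_retries=3, delay=1.0, sleep=time.sleep):
    """Call action(), retrying only on temporary FTP or network errors.

    Hypothetical helper for illustration; not part of ppx.
    """
    for attempt in range(max_retries + 1):
        try:
            return action()
        except ftplib.error_perm:
            # 5xx replies such as "550 Failed to change directory" will not
            # succeed on a retry, so surface them right away.
            raise
        except (ftplib.error_temp, OSError):
            if attempt == max_retries:
                raise
            # Wait longer after each failed attempt (exponential backoff).
            sleep(delay * 2 ** attempt)
```

With a helper like this, a wrong directory fails on the first attempt instead of after 50 reconnects, while a dropped connection is retried after 1, 2, 4, ... seconds.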
Thank you for your reply.
Alas, it looks like this will be more problematic than I originally thought. There are now 7 subdirectories in MassIVE (v0[1-7]), which will take a bit of work to account for.
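One way to account for the extra subdirectories is to try each one in turn until the server accepts a path. The sketch below assumes the volumes are named v01 through v07 as reported above and that a dataset sits directly beneath exactly one of them. `candidate_paths` and `find_dataset_path` are hypothetical names; this is not the patch that went into ppx.

```python
import ftplib


def candidate_paths(msv_id, n_volumes=7):
    """Return the legacy path followed by /v01/<id> ... /v0N/<id>."""
    volumes = [f"/v{i:02d}/{msv_id}" for i in range(1, n_volumes + 1)]
    return [f"/{msv_id}"] + volumes


def find_dataset_path(msv_id, cwd, n_volumes=7):
    """Return the first candidate path that `cwd` accepts, else None.

    `cwd` is any callable that raises ftplib.error_perm for a missing
    directory, for example the bound method of a logged-in ftplib.FTP.
    """
    for path in candidate_paths(msv_id, n_volumes):
        try:
            cwd(path)
            return path
        except ftplib.error_perm:
            continue  # "550 Failed to change directory": try the next one
    return None


# Possible use against a live server (requires network access):
#   ftp = ftplib.FTP("massive.ucsd.edu", timeout=30)
#   ftp.login()
#   find_dataset_path("MSV000090001", ftp.cwd)
```

Passing `cwd` in as a callable keeps the search logic testable without a network connection.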
Hi @wfondrie, While you are working on fixing this bug, is there any way to tell ppx the correct URL and download files from MassIVE anyway? I tried to change the value of the property url in different ways, but it didn't work. Thank you very much!
Hi again @wfondrie, I was able to circumvent the bug by following S.B's second piece of advice here: https://stackoverflow.com/questions/76881788/how-do-i-replace-a-python-property-without-a-setter-for-an-instance. Basically, I changed the value of the property url so that it is the correct URL of the dataset I would like to download. Apparently, it's working since it has been downloading for more than 10 minutes now.
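For anyone else needing a stopgap, one common way to replace a read-only property on a single instance is to swap the instance's class for a dynamically created subclass. This is a minimal sketch and not necessarily the exact approach from the linked answer. The `Project` class here is a stand-in, not the real ppx class, and `override_url` is a hypothetical helper.

```python
class Project:
    """Stand-in for a class exposing a read-only `url` property."""

    def __init__(self, msv_id):
        self.id = msv_id

    @property
    def url(self):
        return f"ftp://massive.ucsd.edu/{self.id}"


def override_url(obj, new_url):
    """Give this one instance a subclass whose `url` returns new_url."""
    patched = type(
        type(obj).__name__,   # keep the original class name
        (type(obj),),         # inherit everything else unchanged
        {"url": property(lambda self: new_url)},
    )
    obj.__class__ = patched
    return obj


proj = override_url(
    Project("MSV000090001"),
    "ftp://massive.ucsd.edu/v01/MSV000090001",
)
print(proj.url)  # ftp://massive.ucsd.edu/v01/MSV000090001
```

Only the patched instance is affected; other instances of the class keep the original property. Whether this is enough for a given library depends on where it reads the URL from, so treat it as a temporary workaround until the fixed release is installed.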
Thank you for linking the pull request! Is it already possible to use conda to install the ppx version that correctly retrieves the MassIVE FTP URLs?
The updated version is now available on PyPI, so you can install it with pip. It should also be updated on Bioconda within a few hours!
Thank you very much! And kudos on ppx since it greatly facilitates proteomic data download