
MassIVE has changed the FTP server URLs

Open madina1203 opened this issue 1 year ago • 7 comments

I run into an issue when I try to download a file. I noticed that there is no delay between retries, yet the code runs for exactly 8 seconds before failing, which seems odd. It looks more like the connection is being interrupted somehow. Is there another way to solve this?

import ppx

proj = ppx.find_project("MSV000090001", fetch=True, timeout=30)
proj._parser.max_reconnects = 50  # raise the reconnect limit from its default
proj.download("raw/RP_IDX_RAW/yeast_H2O_OTMS2_IDX_2_MminusHOnly.raw")

madina1203 Jan 09 '24 20:01

I increased the number of retries to 50, but it did not help.

madina1203 Jan 09 '24 20:01

Hi @madina1203 - thanks for sharing your issue!

Can you post the error that you receive as well? Unfortunately, sometimes the repositories change their API suddenly and I suspect that may be the case here.

wfondrie Jan 09 '24 21:01

I was able to confirm that MassIVE now adds v01 to their FTP site URLs, which is causing the issue.

What used to be:

ftp://massive.ucsd.edu/MSV000090001

Is now:

ftp://massive.ucsd.edu/v01/MSV000090001

A patch is inbound!
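
To make the change concrete, here is a minimal sketch of the URL rewrite described above. The helper name `add_volume_prefix` is hypothetical and is not part of ppx; it only illustrates inserting the `v01` directory ahead of the dataset accession, and it assumes the dataset really lives under the volume you pass in.

```python
from urllib.parse import urlsplit, urlunsplit


def add_volume_prefix(url: str, volume: str = "v01") -> str:
    """Insert a volume directory (e.g. "v01") in front of the dataset path.

    Hypothetical helper for illustration only; not part of ppx.
    """
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    # Leave the URL alone if it already starts with this volume directory.
    if segments and segments[0] == volume:
        return url
    new_path = "/" + "/".join([volume, *segments])
    return urlunsplit(
        (parts.scheme, parts.netloc, new_path, parts.query, parts.fragment)
    )


old_url = "ftp://massive.ucsd.edu/MSV000090001"
print(add_volume_prefix(old_url))  # ftp://massive.ucsd.edu/v01/MSV000090001
```

Applying the helper twice returns the same URL, so it is safe to call on URLs that may already carry the prefix.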

wfondrie Jan 09 '24 23:01

This was the error:

Traceback (most recent call last):
  File "/Users/madinabekbergenova/Desktop/PhD/phd_ms_spectrum/main.py", line 5, in <module>
    downloaded_raw = proj.download("raw/RP_IDX_RAW/yeast_H2O_OTMS2_IDX_2_MminusHOnly.raw")
  File "/Users/madinabekbergenova/Desktop/PhD/phd_ms_spectrum/venv/lib/python3.10/site-packages/ppx/project.py", line 238, in download
    return self._parser.download(
  File "/Users/madinabekbergenova/Desktop/PhD/phd_ms_spectrum/venv/lib/python3.10/site-packages/ppx/ftp.py", line 262, in download
    self._download_file(
  File "/Users/madinabekbergenova/Desktop/PhD/phd_ms_spectrum/venv/lib/python3.10/site-packages/ppx/ftp.py", line 135, in _download_file
    self.connect()
  File "/Users/madinabekbergenova/Desktop/PhD/phd_ms_spectrum/venv/lib/python3.10/site-packages/ppx/ftp.py", line 84, in connect
    self._with_reconnects(self._connect)
  File "/Users/madinabekbergenova/Desktop/PhD/phd_ms_spectrum/venv/lib/python3.10/site-packages/ppx/ftp.py", line 113, in _with_reconnects
    raise error_temp(
ftplib.error_temp: Failed after 50 reconnect(s), the last error was: 550 Failed to change directory.

Thank you for your reply.
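
The last error in the traceback is reply code 550. FTP 5xx replies are permanent failures, which is why raising the reconnect limit to 50 could not help: the directory simply is not at that path. As a rough sketch of the general idea, a retry helper can stop immediately on permanent errors and back off only on transient ones. `with_retries` below is a hypothetical function written for illustration; it is not ppx's `_with_reconnects`, whose internals are not shown in this thread.

```python
import ftplib
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retries(action: Callable[[], T], max_retries: int = 5, delay: float = 1.0) -> T:
    """Call `action`, retrying transient FTP errors but not permanent ones.

    Hypothetical helper for illustration only; not part of ppx.
    """
    for attempt in range(max_retries + 1):
        try:
            return action()
        except ftplib.error_perm:
            # 5xx replies (such as 550) will not succeed on a retry.
            raise
        except (ftplib.error_temp, OSError):
            if attempt == max_retries:
                raise
            # Exponential backoff: delay, 2*delay, 4*delay, ...
            time.sleep(delay * 2 ** attempt)
    raise AssertionError("unreachable")
```

With this structure, a missing directory fails on the first attempt rather than after every retry has been spent, which also makes the real cause easier to spot.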

madina1203 Jan 10 '24 09:01

Alas, it looks like this will be more problematic than I originally thought. There are now 7 subdirectories in MassIVE (v0[1-7]), which will take a bit of work to account for.
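
One way to account for several volume directories is to probe each one until the dataset is found. The sketch below is only an illustration of that idea and is not necessarily how the eventual patch works. `find_volume` and `probe_massive` are hypothetical names, and the anonymous login in `probe_massive` is an assumption.

```python
import ftplib
from typing import Iterable, Optional


def find_volume(ftp, accession: str, volumes: Iterable[str]) -> Optional[str]:
    """Return the first volume directory that contains `accession`, else None.

    `ftp` is any object with a `cwd(path)` method that raises
    `ftplib.error_perm` for a missing directory, such as `ftplib.FTP`.
    Hypothetical helper for illustration only; not part of ppx.
    """
    for volume in volumes:
        try:
            ftp.cwd(f"/{volume}/{accession}")
        except ftplib.error_perm:
            continue  # e.g. "550 Failed to change directory": try the next one
        return volume
    return None


# v01 through v07, matching the seven subdirectories mentioned above.
VOLUMES = [f"v{i:02d}" for i in range(1, 8)]


def probe_massive(accession: str) -> Optional[str]:
    """Probe the live server (needs network; anonymous login is assumed)."""
    with ftplib.FTP("massive.ucsd.edu", timeout=30) as ftp:
        ftp.login()
        return find_volume(ftp, accession, VOLUMES)
```

Passing the connection in as an argument keeps the lookup logic separate from the network call, so it can be exercised with a stand-in object before touching the real server.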

wfondrie Mar 06 '24 23:03

Hi @wfondrie, while you are working on fixing this bug, is there any way to give ppx the correct URL and download files from MassIVE in the meantime? I tried changing the value of the url property in several ways, but it didn't work. Thank you very much!

RobAlbn Mar 27 '24 15:03

Hi again @wfondrie, I was able to circumvent the bug by following S.B's second piece of advice here: https://stackoverflow.com/questions/76881788/how-do-i-replace-a-python-property-without-a-setter-for-an-instance. Basically, I changed the value of the property url so that it is the correct URL of the dataset I would like to download. Apparently, it's working since it has been downloading for more than 10 minutes now.
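
For readers who cannot open the link, here is a minimal sketch of one general Python technique for overriding a read-only property on a single instance. It is an assumption that this resembles the approach from the linked answer. The `Project` class below is a stand-in written for this example, not the ppx class, and `override_url` is a hypothetical helper.

```python
class Project:
    """Stand-in for a class exposing a read-only `url` property (hypothetical)."""

    def __init__(self, accession: str) -> None:
        self.accession = accession

    @property
    def url(self) -> str:
        return f"ftp://massive.ucsd.edu/{self.accession}"


def override_url(obj, new_url: str) -> None:
    """Make `obj.url` return `new_url` without touching other instances."""
    cls = type(obj)
    # Build a one-off subclass whose `url` property shadows the original,
    # then re-point only this instance at that subclass.
    patched = type(
        f"Patched{cls.__name__}",
        (cls,),
        {"url": property(lambda self: new_url)},
    )
    obj.__class__ = patched


proj = Project("MSV000090001")
other = Project("MSV000090001")
override_url(proj, "ftp://massive.ucsd.edu/v01/MSV000090001")
print(proj.url)   # ftp://massive.ucsd.edu/v01/MSV000090001
print(other.url)  # ftp://massive.ucsd.edu/MSV000090001
```

Because only the one instance is re-pointed at the patched subclass, other objects of the same class keep their original behavior. This kind of patch is a stopgap and is best removed once a fixed release is installed.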

RobAlbn Mar 28 '24 12:03

Thank you for linking the pull request! Is it already possible to use conda to install the ppx version that correctly retrieves the MassIVE FTP URLs?

RobAlbn Apr 16 '24 13:04

The updated version is now available on PyPI, so you can install it with pip. However, it should be updated on Bioconda within a few hours!

wfondrie Apr 16 '24 17:04

Thank you very much! And kudos on ppx, since it greatly facilitates downloading proteomics data.

RobAlbn Apr 16 '24 17:04