ssgetpy
Slow download speed
Thank you for this project! It's much easier to get matrices now. The only issue I found so far is that it takes days to download big matrices. I've changed the chunk size along with the sleep duration so now it seems to be as fast as downloading with the browser. Here is my change:
with open(localdest, "wb") as outfile, tqdm(
    total=content_length, desc=self.name, unit="B"
) as pbar:
    for chunk in response.iter_content(chunk_size=131072):
        outfile.write(chunk)
        pbar.update(131072)
        time.sleep(0.01)
I'm not sure about exact numbers, but it most probably should be higher than in the trunk. Is there any reason to keep the chunk size that small?
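For comparison, here is a minimal sketch of the same loop with no sleep at all. It reports the actual number of bytes written rather than a fixed 131072 per iteration, since the last chunk is usually shorter. `copy_chunks` is a hypothetical helper name, not part of ssgetpy. It accepts any iterable of byte chunks (for example `response.iter_content(chunk_size=131072)` from requests) and an optional progress callback (for example `pbar.update` from tqdm). The in-memory body below is a stand-in so the sketch runs without a network connection.

```python
import io


def copy_chunks(chunks, outfile, on_progress=None):
    """Write every non-empty chunk to outfile and return the total bytes written.

    Hypothetical helper for illustration only; not part of ssgetpy.
    """
    total = 0
    for chunk in chunks:
        if not chunk:
            # Skip empty chunks so they do not count as progress.
            continue
        outfile.write(chunk)
        total += len(chunk)
        if on_progress is not None:
            # Report the real chunk length, not the requested chunk size.
            on_progress(len(chunk))
    return total


# Stand-in for a response body: two full 128 KiB chunks and a short tail.
fake_body = [b"a" * 131072, b"b" * 131072, b"c" * 1000]
buf = io.BytesIO()
seen = []
written = copy_chunks(fake_body, buf, on_progress=seen.append)
```

With a real download, the call would look roughly like `copy_chunks(response.iter_content(chunk_size=131072), outfile, pbar.update)`, assuming `response` was opened with `stream=True`.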
The 0.1 s sleep and (in your change) the 131072-byte chunk size basically mean it downloads a 128 KiB chunk every 0.1 s, or ~1.3 MB/s. @drdarshan may have asked the SuiteSparse people how hard they want this package to hit their hosting, or @drdarshan may have just picked a "safe" conservative speed.
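To make the arithmetic explicit, here is a small sketch of the ceiling that a fixed sleep per chunk puts on throughput. `throttle_ceiling` is a hypothetical name used only for illustration. The result is an upper bound, because the time spent actually receiving each chunk is ignored.

```python
def throttle_ceiling(chunk_size_bytes, sleep_seconds):
    """Upper bound on throughput in bytes/s when sleeping once per chunk.

    Ignores the time spent transferring the chunk itself, so real
    throughput is at most this value.
    """
    return chunk_size_bytes / sleep_seconds


# 128 KiB chunk with a 0.1 s sleep: about 1.3 MB/s, as estimated above.
with_point_one = throttle_ceiling(131072, 0.1)

# The same chunk with the 0.01 s sleep from the posted change: about 13 MB/s.
with_point_zero_one = throttle_ceiling(131072, 0.01)

print(f"{with_point_one / 1e6:.2f} MB/s, {with_point_zero_one / 1e6:.2f} MB/s")
```

So with the 0.01 s sleep the cap is roughly ten times higher, and at that point the network rather than the sleep is likely to be the limiting factor for many connections.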
@cwpearson, @senior-zero - I apologize for not seeing this issue sooner. Yes, I picked a safe download speed but will fix this right away. Thank you for reporting this!
A possible (unlimited) fix can be found in PR #3
One more comment: I would guess that the SuiteSparse people are fine with unlimited download speeds, as they allow them via a) the browser and b) the "regular" ssget CLI tool