John Emhoff

12 comments by John Emhoff

Yep, actually that's great -- thanks!

I'm using Google's platform, but I think this will be an issue for any of the backends because of the API that cloudstorage exposes, if I'm not mistaken.

I managed to trim down the script a good bit -- it turns out writing data is unnecessary; the leaks happen just from creating writers: https://gist.github.com/JohnEmhoff/55f562c2de701dfb426643a3e7751ef8

Thanks for looking into it. I think you're right -- I noticed that when my spec in the script above is just a column or two, it leaks much, much...

We were able to accomplish something similar (blacklisting) by cloning the crates.io index, filtering out what we didn't want, and then pointing panamax to that filtered index when syncing. The...
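For anyone wanting to try the same thing, here is a rough sketch of the filtering step only. It is not the script we actually ran. It assumes a local clone of the index in which each crate is a single file named after the crate, nested under prefix directories, with a `config.json` at the top level. The `filter_index` name and the blocklist contents are made up for illustration, and the cloning and panamax configuration steps are not shown.

```python
from pathlib import Path


def filter_index(index_dir, blocked):
    """Delete the index file of every crate named in `blocked`.

    Assumes one file per crate, named after the crate (hypothetical helper).
    Returns the sorted names of the files that were removed.
    """
    root = Path(index_dir)
    blocked = {name.lower() for name in blocked}
    removed = []
    # Materialize the listing first so deleting files does not disturb the walk.
    for path in list(root.rglob("*")):
        rel = path.relative_to(root)
        # Leave directories, git metadata, and the top-level config.json alone.
        if not path.is_file() or ".git" in rel.parts or rel.name == "config.json":
            continue
        if path.name.lower() in blocked:
            path.unlink()
            removed.append(path.name)
    return sorted(removed)
```

After running something like `filter_index("crates.io-index", {"some-unwanted-crate"})`, you would still need to commit the result in the clone before pointing a mirror at it.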

Personally, I'm continually bitten by the caching behavior and just turn it off completely. In any kind of distributed system there will quite often be updates to GCS that are...

I'm getting the exact same behavior in a cluster of worker processes and I'm having a hard time reproducing. It looks something like this: 1. Master machine creates file at...

Sure, I'll give 0.4.0 a try. On that note, would you consider pinning dependencies in releases, especially fsspec? I could be wrong but this feels like a regression -- we're...

I assume it's an issue between gcsfs and fsspec because this started happening out of the blue even though we were still using the same gcsfs version. We've been bitten...

I'm of the opinion that caching should be done at the application level rather than in the library, although I know there are a lot of different use cases out there.
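To make "application level" concrete, here is a minimal sketch of what I mean, not anything taken from gcsfs or fsspec. The `ListingCache` class and its `fetch` callback are hypothetical names. The assumption is that the library call itself is uncached and the application decides how long a result may be reused and when to throw it away.

```python
import time


class ListingCache:
    """Application-owned cache with a TTL and explicit invalidation (hypothetical)."""

    def __init__(self, fetch, ttl_seconds=30.0, clock=time.monotonic):
        self._fetch = fetch      # uncached call into the storage library
        self._ttl = ttl_seconds  # how long the application tolerates stale data
        self._clock = clock      # injectable so tests can control time
        self._entries = {}       # key -> (timestamp, value)

    def get(self, key, fresh=False):
        """Return the cached value for `key`, refetching if stale or if `fresh` is set."""
        now = self._clock()
        entry = self._entries.get(key)
        if not fresh and entry is not None and now - entry[0] < self._ttl:
            return entry[1]
        value = self._fetch(key)
        self._entries[key] = (now, value)
        return value

    def invalidate(self, key=None):
        """Drop one entry, or everything when no key is given."""
        if key is None:
            self._entries.clear()
        else:
            self._entries.pop(key, None)
```

A worker that knows another process may have just written a file can call `get(path, fresh=True)` or `invalidate(path)`, which is the kind of decision only the application has enough context to make.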