Florian Jetter
> I found that x.copy() ran in 2 GB/s and pq.read_table(io.BytesIO(bytes)) ran in 180 MB/s. I'm not sure if this comparison is actually fair and valid. Parquet -> Arrow has...
Just a heads up. My current working theory is that the parquet deserialization performance is roughly where it should be (though honestly I don't know), but what we're...
we're not using nightlies
I assume that the dask/distributed use of defaults is coincidental. I wouldn't expect problems switching to nodefaults.
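For reference, switching away from defaults is typically just a channels change in the environment file; a minimal sketch (environment name and dependencies are hypothetical):

```yaml
# environment.yml sketch: listing `nodefaults` tells conda to ignore
# the `defaults` channel entirely for this environment.
name: myenv            # hypothetical environment name
channels:
  - conda-forge
  - nodefaults
dependencies:
  - dask
  - distributed
```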
You should be able to define your module in cloudpickle to be pickled by value to force dask to upload this https://github.com/cloudpipe/cloudpickle?tab=readme-ov-file#overriding-pickles-serialization-mechanism-for-importable-constructs
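A minimal sketch of the pickle-by-value mechanism, using a throwaway in-memory module (the name `mylib` and its contents are hypothetical stand-ins for local code that isn't installed on the workers):

```python
import sys
import types

import cloudpickle

# Build a tiny throwaway module standing in for local code that the
# workers don't have installed.
mylib = types.ModuleType("mylib")
mylib.inc = lambda x: x + 1
sys.modules["mylib"] = mylib

# Register the module to be pickled by value: cloudpickle then embeds
# the code in the pickle stream instead of emitting a bare
# "import mylib" reference that would fail on the workers.
cloudpickle.register_pickle_by_value(mylib)

# Anything dask serializes from this module now travels with the task.
payload = cloudpickle.dumps(mylib.inc)
```

With that in place, submitting `mylib.inc` to a dask cluster should work even though the workers can't import `mylib` themselves.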
This is not a shortcoming of the plugin system. You are faced with the exact same problem if you are submitting functions as ordinary tasks so this is a problem...
If your parquet files are 150MB on disk, chances are they are easily 1GB in memory, if not more, and there are two threads per worker loading this...
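A back-of-envelope sketch of why this blows up memory (the expansion factor is an assumption and is highly data dependent; Parquet's encoding and compression commonly decode to several times the on-disk size):

```python
# Hypothetical numbers illustrating peak memory during deserialization.
on_disk_mb = 150         # size of one parquet file on disk
expansion = 7            # assumed decode expansion factor (data dependent)
threads_per_worker = 2   # both threads may be deserializing at once

in_memory_mb = on_disk_mb * expansion
peak_mb = in_memory_mb * threads_per_worker
print(in_memory_mb)  # 1050 -> ~1GB per file in memory
print(peak_mb)       # 2100 -> ~2GB transient peak per worker
```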
Apologies for the CI failures. Our CI is indeed haunted by flaky tests. The proposed changes LGTM, but maybe we'll wait for a bit to get a review on the CPython...
> If you care about keeping the tests running on ancient openssl, how ancient? (I think if CPython doesn't test more, I'm good with this)

> Hmm, reviewing the PR,...