Kevin German
The implementation for this feature should be pretty simple. If we change the impl of `results.py#get_filename(...)` to look something like

```python
def get_filename(machine, commit_hash, env_name):
    """
    Get the...
```
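A minimal sketch of the idea (not the exact proposal above, which is truncated): fold the environment name into the per-machine result filename so results from different environments don't collide.

```python
import os

def get_filename(machine, commit_hash, env_name):
    """
    Sketch: build a result filename from the machine name, an abbreviated
    commit hash, and the environment name.
    """
    return os.path.join(machine, f"{commit_hash[:8]}-{env_name}.json")
```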
It's a hack, but on the current head (v0.5*) I have been assigning environment variables in asv.conf.json and then grabbing those as params for the tests. In my case...
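Roughly what that hack looks like, as a sketch (the `BENCH_DATASET` variable name and the workload are made up; the `matrix`/`env` layout follows asv's 0.4+ config syntax):

```python
import os

# asv.conf.json fragment:
#
#   "matrix": {
#       "env": {"BENCH_DATASET": ["small", "large"]}
#   }
#
# asv runs the suite once per value; the benchmark reads the value back out of
# the environment, so the env var effectively becomes a benchmark parameter.

class QuerySuite:
    dataset = os.environ.get("BENCH_DATASET", "small")

    def time_build_rows(self):
        # stand-in workload sized by the env-var "parameter"
        n = 1_000 if self.dataset == "small" else 1_000_000
        sum(range(n))
```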
Simple reproduction:

```python
from queue import Queue
from distributed.protocol import core

def nested_dicts(depth, breadth=1, nth=0):
    retval = {}
    nth = breadth if nth0:
        for i in range(1, b+1):
            if i/n != round(i/n):
                curr[f"key{i}"] = "bears"
...
```
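A self-contained version of that idea, as a sketch (not the original snippet): build a dict nested deeper than msgpack's default recursion limit and push it through distributed's message packing (`distributed.protocol.dumps`). The exact exception type may vary by distributed version:

```python
from distributed.protocol import dumps

def nested_dicts(depth):
    """Build a dict nested `depth` levels deep."""
    root = curr = {}
    for i in range(depth):
        curr[f"key{i}"] = {}
        curr = curr[f"key{i}"]
    return root

try:
    dumps(nested_dicts(600))
except Exception as exc:   # typically ValueError: recursion limit exceeded
    print(type(exc).__name__, exc)
```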
This problem occurs when we are running larger workloads on 8- or 16-node clusters. Although it still comes up without profiling enabled, enabling Dask profiling reliably recreates the issue...
Example stack where it occurred: https://gist.github.com/VibhuJawa/d44e5c60c9b24aa357b0c9ec6f9306e4#file-worker-log-L407
It is worth noting that the exception is thrown by msgpack because a nesting depth of 512 exceeds the recursion limit defined here: https://github.com/msgpack/msgpack-python/blob/main/msgpack/_packer.pyx#L50
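The limit is easy to demonstrate with msgpack alone:

```python
import msgpack

def nest(depth):
    """Return a dict nested `depth` levels deep."""
    root = curr = {}
    for _ in range(depth):
        curr["child"] = {}
        curr = curr["child"]
    return root

msgpack.packb(nest(500))      # under the limit: fine
try:
    msgpack.packb(nest(600))  # over the limit
except ValueError as exc:
    print(exc)                # e.g. "recursion limit exceeded."
```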
Wonder if there is a workaround by adding a new serializer when we instantiate the client for the workload. http://distributed.dask.org/en/stable/serialization.html#extend
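Along the lines of the linked docs, a sketch of registering type-specific serialization so a known-deep payload goes through pickle instead of msgpack. `ProfilePayload` is a made-up wrapper class here, and whether this path would actually cover the internal profiling messages that trigger the error is a separate question:

```python
import pickle
from distributed.protocol import dask_serialize, dask_deserialize

class ProfilePayload:
    """Hypothetical wrapper around a deeply nested dict (e.g. profile output)."""
    def __init__(self, data):
        self.data = data

@dask_serialize.register(ProfilePayload)
def _serialize(obj):
    header = {}                        # no extra metadata needed
    frames = [pickle.dumps(obj.data)]  # pickle is not subject to msgpack's 512-level cap
    return header, frames

@dask_deserialize.register(ProfilePayload)
def _deserialize(header, frames):
    return ProfilePayload(pickle.loads(frames[0]))
```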
I was thinking about inserting a custom serializer when I create the client for tasks that I know will be expensive and/or when I want to enable profiling.
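For the client-side piece, the serialization docs show that serializer preferences can be passed when creating the `Client`; a sketch (the scheduler address is a placeholder, and I have not verified this reaches the messages that hit the limit):

```python
from distributed import Client

client = Client(
    "tcp://scheduler:8786",          # placeholder address
    serializers=["dask", "pickle"],  # preference order for serialization families
    deserializers=["dask", "pickle"],
)
```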
We are hitting this when we run GPU-BDB queries. They are intended to be representative of real-world use cases.
I have another commit pending which fetches the logs into a single df that can then be serialized (Parquet?). Still need an example analysis. Probably go back to a Bokeh...
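A rough sketch of the "logs into a single df, then Parquet" step (the record fields here are invented placeholders, not what the pending commit actually collects):

```python
import pandas as pd

# Stand-in for whatever per-worker log records get fetched.
records = [
    {"worker": "tcp://10.0.0.1:40000", "level": "INFO", "msg": "task started"},
    {"worker": "tcp://10.0.0.2:40000", "level": "WARN", "msg": "long gc pause"},
]

df = pd.DataFrame(records)
df.to_parquet("worker_logs.parquet")  # needs pyarrow or fastparquet installed
```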