Kevin German
The implementation for this feature should be pretty simple. If we change the impl of `results.py#get_filename(...)` to look something like

```python
def get_filename(machine, commit_hash, env_name):
    """
    Get the...
```
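A minimal sketch of the idea (not the exact proposal above, which is truncated): fold the environment name into the per-machine result filename so results from different environments don't collide.

```python
import os

def get_filename(machine, commit_hash, env_name):
    """
    Sketch: build a result filename from the machine name, an abbreviated
    commit hash, and the environment name.
    """
    return os.path.join(machine, f"{commit_hash[:8]}-{env_name}.json")
```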
It's a hack, but on the current head (v0.5*) I have been assigning environment variables in asv.conf.json and then grabbing those as params for the tests. In my case...
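Roughly what that hack looks like, as a sketch (the `BENCH_DATASET` variable name and the workload are made up; the `matrix`/`env` layout follows asv's 0.4+ config syntax):

```python
import os

# asv.conf.json fragment:
#
#   "matrix": {
#       "env": {"BENCH_DATASET": ["small", "large"]}
#   }
#
# asv runs the suite once per value; the benchmark reads the value back out of
# the environment, so the env var effectively becomes a benchmark parameter.

class QuerySuite:
    dataset = os.environ.get("BENCH_DATASET", "small")

    def time_build_rows(self):
        # stand-in workload sized by the env-var "parameter"
        n = 1_000 if self.dataset == "small" else 1_000_000
        sum(range(n))
```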
Simple reproduction:

```python
from queue import Queue
from distributed.protocol import core

def nested_dicts(depth, breadth=1, nth=0):
    retval = {}
    nth = breadth if nth0:
        for i in range(1, b+1):
            if i/n != round(i/n):
                curr[f"key{i}"] = "bears"
...
```
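A self-contained version of that idea, as a sketch (not the original snippet): build a dict nested deeper than msgpack's default recursion limit and push it through distributed's message packing (`distributed.protocol.dumps`). The exact exception type may vary by distributed version:

```python
from distributed.protocol import dumps

def nested_dicts(depth):
    """Build a dict nested `depth` levels deep."""
    root = curr = {}
    for i in range(depth):
        curr[f"key{i}"] = {}
        curr = curr[f"key{i}"]
    return root

try:
    dumps(nested_dicts(600))
except Exception as exc:   # typically ValueError: recursion limit exceeded
    print(type(exc).__name__, exc)
```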
This problem occurs when we are running larger workloads on 8- or 16-node clusters. Although it still comes up without profiling enabled, enabling Dask profiling reliably recreates the issue...
Example stack where it occurred: https://gist.github.com/VibhuJawa/d44e5c60c9b24aa357b0c9ec6f9306e4#file-worker-log-L407
It is worth noting that the exception is thrown by msgpack because a nesting depth of 512 exceeds the recursion limit defined here: https://github.com/msgpack/msgpack-python/blob/main/msgpack/_packer.pyx#L50
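The limit is easy to demonstrate with msgpack alone:

```python
import msgpack

def nest(depth):
    """Return a dict nested `depth` levels deep."""
    root = curr = {}
    for _ in range(depth):
        curr["child"] = {}
        curr = curr["child"]
    return root

msgpack.packb(nest(500))      # under the limit: fine
try:
    msgpack.packb(nest(600))  # over the limit
except ValueError as exc:
    print(exc)                # e.g. "recursion limit exceeded."
```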
Wonder if there is a workaround by adding a new serializer when we instantiate the client for the workload. http://distributed.dask.org/en/stable/serialization.html#extend
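Along the lines of the linked docs, a sketch of registering type-specific serialization so a known-deep payload goes through pickle instead of msgpack. `ProfilePayload` is a made-up wrapper class here, and whether this path would actually cover the internal profiling messages that trigger the error is a separate question:

```python
import pickle
from distributed.protocol import dask_serialize, dask_deserialize

class ProfilePayload:
    """Hypothetical wrapper around a deeply nested dict (e.g. profile output)."""
    def __init__(self, data):
        self.data = data

@dask_serialize.register(ProfilePayload)
def _serialize(obj):
    header = {}                        # no extra metadata needed
    frames = [pickle.dumps(obj.data)]  # pickle is not subject to msgpack's 512-level cap
    return header, frames

@dask_deserialize.register(ProfilePayload)
def _deserialize(header, frames):
    return ProfilePayload(pickle.loads(frames[0]))
```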
I was thinking about inserting a custom serializer when I create the client for tasks that I know will be expensive and/or when I want to enable profiling.
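For the client-side piece, the serialization docs show that serializer preferences can be passed when creating the `Client`; a sketch (the scheduler address is a placeholder, and I have not verified this reaches the messages that hit the limit):

```python
from distributed import Client

client = Client(
    "tcp://scheduler:8786",          # placeholder address
    serializers=["dask", "pickle"],  # preference order for serialization families
    deserializers=["dask", "pickle"],
)
```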
We are hitting this when we run GPU-BDB queries. They are intended to be representative of real-world use cases.
I have another commit pending which fetches the logs into a single df that can then be serialized (Parquet?). Still need an example analysis. Probably go back to a Bokeh...
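A rough sketch of the "logs into a single df, then Parquet" step (the record fields here are invented placeholders, not what the pending commit actually collects):

```python
import pandas as pd

# Stand-in for whatever per-worker log records get fetched.
records = [
    {"worker": "tcp://10.0.0.1:40000", "level": "INFO", "msg": "task started"},
    {"worker": "tcp://10.0.0.2:40000", "level": "WARN", "msg": "long gc pause"},
]

df = pd.DataFrame(records)
df.to_parquet("worker_logs.parquet")  # needs pyarrow or fastparquet installed
```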