Gabe Joseph

Results 117 issues of Gabe Joseph

_This is a duplicate of https://github.com/dask/dask/pull/8209, but I'm taking over from @fjetter since I may want to occasionally push small fixes here. Original message copied below, with edits._ This is...

dataframe

**Minimal Complete Verifiable Example**:

```python
import dask.dataframe as dd

df = dd.demo.make_timeseries()

# Works with divisions
df.x.rolling("50D").mean().compute()
# timestamp
# 2000-01-31 00:00:00    0.637020
# 2000-01-31 00:00:10   -0.179649
# 2000-01-31 00:00:20...
```

dataframe
needs attention
bug

Because `fuse_roots` materializes Blockwise layers, if you have a situation where you're sub-selecting a few items out of a large initial Blockwise array, culling the graph (cheap) before materializing it...

array
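
To illustrate what "culling" means in the `fuse_roots` item above, here is a minimal sketch on a plain dict task graph. It is a toy, not dask's implementation: the `cull` helper, the key names, and the convention that a task is a `(func, *args)` tuple with string keys are all assumptions made for this example, and it does not model Blockwise layers or `HighLevelGraph`.

```python
from operator import add


def cull(graph, keys):
    """Return only the part of `graph` needed to compute `keys` (toy helper)."""
    needed = {}
    stack = list(keys)
    while stack:
        key = stack.pop()
        if key in needed:
            continue
        task = graph[key]
        needed[key] = task
        # Assumed toy convention: a task is (func, *args); any string
        # argument that is also a graph key counts as a dependency.
        if isinstance(task, tuple):
            stack.extend(
                arg for arg in task[1:] if isinstance(arg, str) and arg in graph
            )
    return needed


# A toy "large" graph: 1000 input chunks, each with one dependent task.
graph = {f"x-{i}": i for i in range(1000)}
graph.update({f"y-{i}": (add, f"x-{i}", 1) for i in range(1000)})

# Selecting two outputs only requires four of the 2000 tasks.
small = cull(graph, ["y-3", "y-7"])
print(sorted(small))  # ['x-3', 'x-7', 'y-3', 'y-7']
```

The walk only visits tasks reachable from the requested keys, which is why culling first keeps later steps from touching the rest of the graph.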

Perhaps a niche use-case, but sometimes I find myself SSHing into a box (say, a k8s node) that's running Python workloads in docker, but doesn't have pip installed on the...

The converse of https://github.com/benfred/py-spy/issues/229: I prefer speedscope format, so it would be nice to just be able to type `py-spy record -o profile.json` and not also have to add `-f...

[This file](https://gist.githubusercontent.com/gjoseph92/8485ce1b7eeaf7a6496c547133b27224/raw/c689e9c77ff51b8b74753f5b8faf3b0bd513e418/profile-no-negative-durs.json) in Chrome trace format looks very different in speedscope versus `chrome://tracing`: ![Screen Shot 2021-05-12 at 5 20 26 PM](https://user-images.githubusercontent.com/3309802/118056145-9f804b80-b346-11eb-9af2-04aac5ce2cc4.png) ![Screen Shot 2021-05-12 at 5 20 59 PM](https://user-images.githubusercontent.com/3309802/118056153-a1e2a580-b346-11eb-8861-d61c60bee05c.png) In...

With multi-threaded profiles, setting `#title=` turns every thread's name into the title, which makes them difficult to tell apart. When there are multiple threads, I'd prefer if it kept...

When a worker breaks its connection to the scheduler, or goes too long without heartbeating, the scheduler removes it and reschedules its tasks to run elsewhere. However, other workers may...

discussion
deadlock

In https://github.com/dask/distributed/issues/6110#issuecomment-1105837219, we found that workers were running themselves out of memory to the point where the machines became unresponsive. Because the memory limit in the Nanny is implemented [at...

stability
memory

It would be handy to be able to record multiple measures at once in `MemorySampler`. Particularly, recording `process` and `managed_spilled` at the same time gives you a picture of both...

enhancement
diagnostics