David Parks

Results 27 comments of David Parks

A few details: 1. Seek _does need_ to close the body even within the 10 MB range. That range is not a buffer that has been downloaded, it's an open...
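To make the seek point concrete, here is a minimal sketch of a reader whose open range is a forward-only stream rather than downloaded bytes. Everything in it is hypothetical and not taken from the library under discussion: the `RangedReader` class, the `open_range(start, end)` callable, and the `opens` counter are illustrative names, and an in-memory `io.BytesIO` stands in for the network body. It also skips reopening once a range is exhausted.

```python
import io


class RangedReader:
    """Hypothetical reader that pulls a remote object through forward-only ranged streams."""

    def __init__(self, open_range, range_size=10 * 1024 * 1024):
        # open_range(start, end) is an assumed callable returning a forward-only stream.
        self._open_range = open_range
        self._range_size = range_size
        self._pos = 0
        self._body = None
        self.opens = 0  # number of range requests issued, for illustration only

    def read(self, n):
        # Lazily open a stream starting at the current position.
        if self._body is None:
            self._body = self._open_range(self._pos, self._pos + self._range_size)
            self.opens += 1
        data = self._body.read(n)
        self._pos += len(data)
        return data

    def seek(self, offset):
        # The open range is a stream, not a downloaded buffer, so moving to any
        # other position (even one inside the range) means closing the body.
        if offset != self._pos and self._body is not None:
            self._body.close()
            self._body = None
        self._pos = offset


# Usage: an in-memory stand-in for the remote object.
payload = bytes(range(256)) * 4
reader = RangedReader(lambda s, e: io.BytesIO(payload[s:e]), range_size=100)
first = reader.read(4)
reader.seek(50)          # inside the first range, but still forces a new request
second = reader.read(4)
```

In this sketch a seek to the current position is a no-op, so purely sequential reads keep the same body open.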

Actually, while I'm at it, the `DEFAULT_BUFFER_SIZE` of 128k feels a bit arbitrary. That buffer doesn't reduce network overhead since the streaming body stays open; it's just forcing a...

I doubt the FUSE-based filesystem will be especially efficient for streaming and dealing with high-performance reads of large files, but I haven't really put it to the test. I vaguely...

I've finished a first pass at this optimization. We've got a clean set of unit tests (your existing unit tests were fantastic and a huge help in catching edge cases)....

An in-the-wild load test looks pretty good. I'm running two GPU-based ML jobs performing small random reads over a 2 TB dataset (3.5 GB data files). Namespace bandwidth (top)...

> That looks good! What are the actual file types you’re reading, parquet?

These are raw binary electrophysiology data, 25 kHz recordings of electrical potential at 512 recording sites in...

I briefly considered being more predictive about this kind of pattern, but there wasn't a readily obvious way to do that. At this point it would be easy to implement...