David Parks comments

Results: 27 comments by David Parks

Ranged read http header leaves TO bytes blank, causing S3 filesystems to read full file length

A few details: 1. Seek _does need_ to close the body even within the 10 MB range; that range is not a buffer that has been downloaded, it's an open...
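For context on the issue title, here is a minimal sketch of the difference between an open-ended and a bounded HTTP `Range` header value. The helper name `range_header` and its parameters are hypothetical, made up for illustration; this is not the library's actual code.

```python
from typing import Optional


def range_header(start: int, length: Optional[int] = None) -> str:
    """Build an HTTP Range header value (hypothetical helper, for illustration).

    With no length, the end ("TO") byte is left blank, so the server may
    stream everything from `start` to the end of the object. With a length,
    the range is bounded; the end byte position is inclusive.
    """
    if start < 0:
        raise ValueError("start must be non-negative")
    if length is None:
        # Open-ended range: "bytes=START-"
        return f"bytes={start}-"
    if length <= 0:
        raise ValueError("length must be positive")
    # Bounded range: "bytes=START-END", where END is the last byte wanted
    end = start + length - 1
    return f"bytes={start}-{end}"


# Open-ended request versus a request bounded to a 10 MB window
open_ended = range_header(1024)
bounded = range_header(1024, 10 * 1024 * 1024)
```

With the bounded form the server stops sending after the requested window, so closing the body early on a seek wastes at most the remainder of that window instead of the remainder of the file.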

Ranged read http header leaves TO bytes blank, causing S3 filesystems to read full file length

Actually, while I'm at it, the `DEFAULT_BUFFER_SIZE` of 128k feels a bit arbitrary. That buffer doesn't reduce network overhead since the streaming body is staying open, it's just forcing a...

Ranged read http header leaves TO bytes blank, causing S3 filesystems to read full file length

I doubt the FUSE-based filesystem will be especially efficient for streaming and high-performance reads of large files, but I haven't really put it to the test. I vaguely...

Ranged read http header leaves TO bytes blank, causing S3 filesystems to read full file length

I've finished a first pass at this optimization. We've got a clean set of unit tests (your existing unit tests were fantastic and a huge help in catching edge cases)....

Ranged read http header leaves TO bytes blank, causing S3 filesystems to read full file length

An in-the-wild load test looks pretty good. I'm running two GPU-based ML jobs performing small random reads over a 2 TB dataset (3.5 GB data file sizes). Namespace bandwidth (top)...

Ranged read http header leaves TO bytes blank, causing S3 filesystems to read full file length

> That looks good! What are the actual file types you’re reading, parquet?

These are raw binary electrophysiology data, 25 kHz recordings of electrical potential at 512 recording sites in...

Ranged read http header leaves TO bytes blank, causing S3 filesystems to read full file length

I briefly considered being more predictive about this kind of pattern, but there wasn't a readily obvious way to do that. At this point it would be easy to implement...