
Feature request: PartitionedStream

Open Arlodotexe opened this issue 4 years ago • 0 comments

Motivation

Streams are an easy way to store or read a chunk of data from an arbitrary source. However, the only convenient options for storing several chunks of (closely related) data are:

  • Use multiple streams pointing to multiple resources (files, etc)
  • Manually load the entire resource into memory, separate the data into multiple streams, then pass it along.

Both of these options have obvious issues. Writing to multiple files can quickly make a mess of a file system when a single file would have sufficed, while using a single file means loading the entire thing into memory and manually splitting it into several Streams, which is very bad for performance.

For the purpose of seamlessly reading and writing multiple chunks of data within a single dataset, I propose the PartitionedStream.

Proposal

Overview

A PartitionedStream allows partitioning a given stream of data into multiple Streams, which can be passed around to standard APIs and operated on without loading the entire dataset into memory.

The API surface for this is open to discussion, but the base requirements are:

  • A PartitionedStream that takes a Stream in the constructor where partitions can be read/written.
  • A StreamPartition : Stream that is thread-safe and can be used as a normal Stream, but only interacts with a single partition.
  • A method on PartitionedStream that finds and returns all existing StreamPartitions.
  • A method on PartitionedStream that adds a partition and returns a new StreamPartition.
  • A method to delete a partition:
    • Either directly on StreamPartition, or
    • On the PartitionedStream, taking a StreamPartition as a parameter.
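To make the requirements above concrete, here is a rough, in-memory sketch of how the pieces could fit together (written in Python for brevity; the method names beyond PartitionedStream and StreamPartition are hypothetical, and a real implementation would back partitions with the wrapped stream rather than an in-memory list):

```python
import io
import threading

class StreamPartition(io.RawIOBase):
    """A Stream-like view over a single partition (hypothetical sketch;
    the real type would be a C# Stream subclass)."""
    def __init__(self, owner, index):
        self._owner = owner
        self._index = index
        self._pos = 0

    def read(self, size=-1):
        data = self._owner._read_partition(self._index, self._pos, size)
        self._pos += len(data)
        return data

    def write(self, data):
        self._owner._write_partition(self._index, self._pos, bytes(data))
        self._pos += len(data)
        return len(data)

class PartitionedStream:
    """Wraps a source stream and exposes its contents as partitions."""
    def __init__(self, source):
        self._source = source
        self._lock = threading.Lock()  # StreamPartition must be thread-safe
        self._partitions = []          # in-memory stand-in for the partition map

    def get_all_partitions(self):
        return [StreamPartition(self, i) for i in range(len(self._partitions))]

    def add_partition(self):
        with self._lock:
            self._partitions.append(bytearray())
            return StreamPartition(self, len(self._partitions) - 1)

    def remove_partition(self, partition):
        with self._lock:
            self._partitions[partition._index] = None  # tombstone keeps indices stable

    def _read_partition(self, index, pos, size):
        with self._lock:
            buf = self._partitions[index]
            end = len(buf) if size is None or size < 0 else pos + size
            return bytes(buf[pos:end])

    def _write_partition(self, index, pos, data):
        with self._lock:
            buf = self._partitions[index]
            if pos + len(data) > len(buf):
                buf.extend(b"\x00" * (pos + len(data) - len(buf)))
            buf[pos:pos + len(data)] = data
```

Usage would mirror any other Stream: `p = ps.add_partition(); p.write(b"hello")`, then hand `p` to any API expecting a Stream.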

Technical challenges

Fast partition discovery

To discover partitions efficiently, it's vital that we have a partition map which can be read sequentially to avoid making multiple (potentially expensive) read operations just to discover partitions.

When deciding how to discover partitions, the obvious answer is to place a "map" at the start of the stream that we can read sequentially. However, since this map contains data about the partitions, when a new partition is added or an existing one is drastically changed, it will offset every single byte in all partitions.

This is expensive and something we want to avoid as much as possible, so I suggest placing the partition map at the end of the stream instead.

  • The length of the map is the very last byte, allowing the initial read of the map to be done in as few requests as possible.
  • The map can move as data is added or removed, but it's a much smaller amount of data to shift in memory.
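A minimal sketch of this footer layout (the byte format here is illustrative, not a proposed wire format): each map entry is an (offset, length) pair, and the final byte stores the map's size in bytes, so a reader can recover the whole map with two reads from the end of the stream. Note that a single length byte caps the map at 255 bytes; a real format would use a wider field.

```python
import io
import struct

def write_with_footer_map(partitions):
    """Lay out partition data, then a map of (offset, length) pairs,
    then one trailing byte holding the map's length in bytes."""
    body = bytearray()
    entries = []
    for data in partitions:
        entries.append((len(body), len(data)))
        body.extend(data)
    map_bytes = b"".join(struct.pack("<II", off, ln) for off, ln in entries)
    assert len(map_bytes) <= 0xFF  # illustrative 1-byte length field
    return bytes(body) + map_bytes + bytes([len(map_bytes)])

def read_footer_map(stream):
    """Recover all partitions: read the last byte, then the map, then each partition."""
    stream.seek(-1, io.SEEK_END)              # read 1: the map's length
    map_len = stream.read(1)[0]
    stream.seek(-(1 + map_len), io.SEEK_END)  # read 2: the map itself
    map_bytes = stream.read(map_len)
    entries = [struct.unpack_from("<II", map_bytes, i) for i in range(0, map_len, 8)]
    out = []
    for off, ln in entries:
        stream.seek(off)
        out.append(stream.read(ln))
    return out
```

Growing or shrinking a partition only rewrites the map and the partitions after it, and the map itself stays small relative to the data.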

Fast reads, fast writes

Stream allows for synchronous byte-per-byte reads/writes, which many APIs use. This introduces an interesting technical challenge.

To transparently facilitate byte-per-byte reads and writes without breaking functionality, concurrency, or sacrificing performance, we have two options:

  • Direct insert
    • Directly insert new data to the correct partition
    • Shifts all succeeding partitions on every write.
    • Every read/write operation needs to handle shifting partition positions in real time.
    • Significantly easier to implement than fragmented partitions.
  • Fragmented partitions
    • Append new data to the end of the stream (whether a single byte or a buffer of bytes)
    • For finding fragments, either:
      • The start of each fragment is a pointer to the position of the next fragment (sentinel value if none)
      • Each fragment range is stored separately in the map
    • Only fragmented in memory. Must be defragmented when flushed for fast sequential reads when read back later.
    • Better for performance-critical reads/writes (data isn't being constantly shifted around)
    • Until defragmented, uses slightly more memory to store pointers.
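As a sketch of the fragmented-partition option using next-fragment pointers (the header layout and sentinel value are hypothetical): each fragment starts with a fixed-size header holding the position of the next fragment plus the fragment's data length, a sentinel marks the tail, writes append a new fragment at the end, and a defragment pass walks the chain to produce the contiguous bytes a flush would write back.

```python
import struct

SENTINEL = 0xFFFFFFFF          # hypothetical "no next fragment" marker
HEADER = struct.Struct("<II")  # (next-fragment position, fragment data length)

def append_fragment(buf, head, data):
    """Append data as a new fragment at the end of buf, linking the previous
    tail fragment to it. Returns the chain's head position."""
    new_pos = len(buf)
    buf.extend(HEADER.pack(SENTINEL, len(data)))
    buf.extend(data)
    if head == SENTINEL:
        return new_pos  # first fragment becomes the head
    pos = head
    while True:  # walk the chain to the tail
        nxt, _ = HEADER.unpack_from(buf, pos)
        if nxt == SENTINEL:
            struct.pack_into("<I", buf, pos, new_pos)  # relink tail to new fragment
            return head
        pos = nxt

def defragment(buf, head):
    """Walk the chain and return the partition's bytes contiguously,
    as a flush would before writing them back for fast sequential reads."""
    out = bytearray()
    pos = head
    while pos != SENTINEL:
        nxt, ln = HEADER.unpack_from(buf, pos)
        start = pos + HEADER.size
        out.extend(buf[start:start + ln])
        pos = nxt
    return bytes(out)
```

Appends never shift existing data, matching the performance argument above; the cost is the per-fragment header and the tail walk (which a real implementation would avoid by caching the tail position per partition).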

Arlodotexe · Mar 04 '22 18:03