RFC: Enforce `Point` and `Offset` invariants at the type level
Check for existing issues
- [X] Completed
Describe the feature
Offsets and points are used extensively throughout Zed. Each instance of these implicitly has some "space" they are associated with. For example, any given offset may be a MultiBuffer offset, a Buffer offset, a Display offset, and potentially others.
This has a few drawbacks:
- Not self-documenting
When first diving into the codebase, I'd see many functions asking for
offset: usizeorpoint: Point, and at first it was quite hard understanding which text space they were asking for. - Invariants aren't enforced by the compiler, leading to subtle bugs.
I ran into this working on my (WIP) Vim Arguments Text Object pull request. I'm using the syntax tree to select the correct argument node, and converted the syntax offsets to
DisplayPoints. The syntax offset is relative to the buffer, so this was a bug. These bugs can be subtle, as the normal use case is a singleton buffer. New contributors may not think to test in a MultiBuffer, or to test with inlay hints, folds, and other DisplayBuffer features. - Leads to code duplication
We have two versions of
ToOffsetandToPointandToPointUtf16. We also haveDisplayPoint,InlayPoint,InlayOffset,FoldPoint,WrapPoint, and offset variants of some of these. Many of these have duplicated implementations of traits such assum_tree::Dimension. - No single way to convert between spaces There's inconsistency in converting between different spaces, with methods scattered across types and spaces. Discovering the correct conversion method is non-trivial.
Proposed Solution
- Introduce a new
Offsettype, a newtype wrapper around the currentusizewe are using - Make
Point,Offset, andOffsetUtf16generic over the space they are associated with, allowing us to enforce space invariants at compile time. Instead ofusizeandPoint, we'd haveOffset<Buffer>,Point<Buffer>,Offset<MultiBuffer>,Point<MultiBuffer>, etc. Instead ofDisplayPointandInlayPoint, we'd usePoint<DisplayMap>andPoint<InlayMap>. - Make
ToOffsetandToPointgeneric over the space they are converting to. This gives us a single mechanism for converting between spaces.
// Replaces the `ToDisplayPoint` trait, this time enforcing the space invariants at compile time, and self-documenting
impl ToPoint<DisplayMap> for Point<MultiBuffer> {
type Context = DisplaySnapshot;
fn to_point(&self, context: &Self::Context) -> Point<DisplayMap> {
// ...
}
}
// Replaces `MultiBufferSnapshot::point_to_buffer_offset`
impl ToOffset<Buffer> for Point<MultiBuffer> {
type Context = MultiBufferSnapshot;
fn to_offset(&self, context: &Self::Context) -> Offset<Buffer> {
// ...
}
}
Drawbacks
- This is a relatively large refactor, although some of it should be possible to do incrementally. Luckily, this is where Rust shines!
- Slightly more complex. Seeing
offset: Offset<MultiBuffer>instead ofoffset: usizeis a bit more to take in
If applicable, add mockups / screenshots to help present your vision of the feature
No response
I think cleaning up this code is a good idea in general, and I really like the idea of making Point and DisplayPoint less confusing by having the type-name be self-descriptive. (I would also really like to add a Row<Buffer> vs Row<DisplayMap> distinction as that has bitten me several times, and unify on having a row/column field vs method...etc.).
For a significant refactor like this, we'd like you to spend some time pairing with someone at Zed to get started (so we can quickly answer the questions that come up as we get into implementation, and keep knowledge sharing high). I'd be happy to work with you on this for a bit: https://calendly.com/conradirwin/pairing, but you can also poke around in the Zed channels https://zed.dev/channel/zed-283 and find people to talk it through with.
I took a dive into this after #8870.
A few learnings:
- It is really convenient that the same units are used for the singleton buffer and the rope; so we probably want to use something like
Point<Rope>or some other default Point once we get that far down the stack. - There's a lot of code that is unclear about whether its
Point/usizerefers to the MultiBuffer or the Buffer.
So probably the place to start is to introduce a new abstraction at the MultiBuffer layer and use it for the maps that stack on top (leaving the existing rope::Point and usize for the Buffer for now).
- If we use something like
Point<MultiBuffer>,Row<MultiBuffer>andCol<MultiBuffer>(andOffset<MultiBuffer>) it all seems to work, though we can't rely on#[derive]because of the type parameter, so the implementations are a bit verbose. - We probably need more concrete helpers than just the
ToOffsettraits (for example to go from aPoint<MultiBuffer>to arope::Pointin an excerpt we subtract twoRow<MultiBuffer>s and add them to arope::Point).
I haven’t had much time to look at this, but played around with it a bit last weekend. I came to the same conclusion - rope and buffer need to have the same coordinate space.
I successfully got the rope crate to compile with a new Offset<Rooe> type, but had to stop there. Hopefully I’ll have more time to revisit in the next weekend or so.