fixed-map icon indicating copy to clipboard operation
fixed-map copied to clipboard

Use a bit set optimization for unit variant keys

Open udoprog opened this issue 1 year ago • 3 comments

When following key:

#[derive(Key)]
enum MyKey {
    First,
    Second,
    Third,
}

Is stored in a Set it is currently backed by a structure like this since it's based on a map storing () values:

struct Storage {
    data: [Option<()>; 3],
}

Which occupies 3 bytes. This could instead be optimized to make use of a bit set (like I do in bittle) who's size depends on the number of variants:

struct SetStorage {
    // can represent 8 distinct values.
    data: u8,
}

Performance Musings

Whether this would be "as good" as a manually implemented bitset is hard to say. It might require additional specialization such as ensuring that MyKey is #[repr(<integer>)] and that each variant specifies a distinct value, such as:

#[derive(Key)]
#[repr(u8)]
#[key(bitset = "u8")]
enum MyKey {
    First = 0b1000,
    Second = 0b0100,
    Third = 0b0010,
}

Without that it might be difficult to ensure that certain operations are optimal, to the same degree as a manual bitset:

let set = 0b1001;

// contains
if set & MyKey::First as u8 != 0 {
    // and so forth.
}

Iteration might be the hardest one to perform with good performance, but in all fairness that is also something which is difficult with a manual bitset. Ensuring some representation and that each variant is reliably specified might mean that it's possible to coerce values into MyKey (unsafely) which wouldn't be easy to do manually.

udoprog avatar Oct 26 '22 19:10 udoprog

This is a really interesting idea.

Rather than optimistically implementing Set with a bitset, I think it would be better to provide a separate type that uses bitsets. This way, there's no need for any extra attributes, and assuming there are some performance tradeoffs, the user can easily choose. That type would require that the key implements the BitKey trait, for which we would provide a separate derive macro.

At least as a starting point, require that all variants are manually assigned an integer. And then within the BitKey derive macro, enforce that bit assignments must be contiguous. There can not be any skipped bits, but they don't have to start at the first bit. This makes the iterator easy. And check that there are no bit conflicts.

pitaj avatar Oct 26 '22 19:10 pitaj

Another thing: it is technically possible to support non-unit variant keys (as long as they don't have any dynamic inners) by match-mapping the variants to integers.

#[derive(Clone, Copy, Key, BitKey)]
enum Key {
    One,
    Two(Option<()>),
    Four(Option<Option<bool>>),
}

impl BitKey for Key {
    fn contains(&self, key: Self) {
        let mask = match key {
            Key::One => 0b0000_0001,
            Key::Two(Some(_)) => 0b0000_0010,
            Key::Two(None) => 0b0000_0100,
            Key::Four(Some(Some(true))) => 0b0000_1000,
            Key::Four(Some(false))) => 0b0001_0000,
            Key::Four(Some(None)) => 0b0010_0000,
            Key::Four(None) => 0b0100_0000,
        };

        ...
    }
}

Obviously this couldn't be as optimal as the unit variant only case, but it is possible.

pitaj avatar Oct 26 '22 19:10 pitaj

It's an interesting proposition, all though I'm not sure if it's possible in the generic sense: since we don't have access to type information we'd have to guess that the type identifier Option means std::option::Option and so forth.

Or use an attribute #[key(option(bool))] or #[key(bool)] to have it try and perform the optimal code generation.

It's an interesting idea though. But I'd be more than happy to only have simple unit variants work first.

udoprog avatar Oct 26 '22 23:10 udoprog