bon icon indicating copy to clipboard operation
bon copied to clipboard

Support member groups: mutually exclusive, mutually required members

Open Veetaha opened this issue 1 year ago • 15 comments

Allow for groups and conflicts like in clap. A feature like this can be seen in const_typed_builder but according to crate's docs it requires significant compile time overhead.

The main requirement is for the user to allow to specify:

  • if this field was set, then some other field(s) must be set
  • if this field was set, then some other field(s) most not be set

Originally suggested by @dzmitry-lahoda in https://github.com/elastio/bon/issues/85#issuecomment-2336491289

A note for the community from the maintainers

Please vote on this issue by adding a 👍 reaction to help the maintainers with prioritizing it. You may add a comment describing your real use case related to this issue for us to better understand the problem domain.

Veetaha avatar Sep 08 '24 00:09 Veetaha

@dzmitry-lahoda I think I know what compile-time builder crate you are talking about which implements such a feature, and I was thinking about having it at some point in bon as well. This requires thorough design, because I'd like to avoid having the users of this feature do an unwrap, for example when they consume such groups of parameters. However, I understand that this may not be avoidable, and there will be some limit to the strictness of types.

For example, with clap all parameters in a group are just Option<T> types, and you have an implicit contract that clap validates the invariants of the group for you.

I was also thinking of having some lower-level API in bon to assist in constructing such complex type states. There is nothing material yet, but at some point this will be the next big feature priority in bon and then I'll think on the design more thorougly

Veetaha avatar Sep 08 '24 00:09 Veetaha

In the meantime I think the simplest way to achieve member groups is just by using enums. Like this:

#[derive(bon::Builder)]
struct Example {
    #[builder(into)]
    group: Group
}

#[derive(from_variants::FromVariants)]
enum Group {
    Variant1(Variant1),
    Variant2(Variant2),
}

#[derive(Builder)]
struct Variant1 { /**/ }

#[derive(Builder)]
struct Variant2 { /**/ }

let example = Example::builder()
    .group(Variant1::builder()/*set the members of the group*/.build())
    .build();

Note that this snippet is for a hypothetical use case that needs to have mutually exclusive groups of parameters. Your use case may probably be different. Maybe if you provided some real example (or real-ish if your use case is private), just to have specific problem example to start with? That'll help with understanding the problem domain better for me.


Btw. just a heads-up, I'm going to do a 2.2 release of bon on Sunday afternoon that will feature a new #[derive(bon::Builder)] syntax. It was added for various unobvious reasons that I'll describe in the blog post that I'll publish on Reddit

Veetaha avatar Sep 08 '24 00:09 Veetaha

here is my type for example (I have Market, Token types as well)

#[derive(Debug, Clone, PartialEq, Eq)]
#[cfg_attr(test, derive(proptest_derive::Arbitrary, Builder))]
pub struct PlaceOrder {
    #[cfg_attr(test, builder(default))] // #[builder(default)]
    pub session_id: SessionId,
    #[cfg_attr(test, builder(default))] // #[builder(default)]
    pub market_id: MarketId,
    pub side: Side,
    pub fill_mode: FillMode,
    #[cfg_attr(test, builder(default))] // #[builder(default)]
    pub is_reduce_only: bool,
    /// must be same decimals as `Price` if not `None` at current state
    pub price: Option<NonZeroPriceMantissa>,
    #[cfg_attr(test, proptest(filter = "|x| x.is_some()"))]
    pub size: Option<NonZeroOrderSize>,
    pub quote_size: Option<NonZeroU128>,
    /// optional user id which allows to place order on behalf of another user
    pub user_id: Option<UserId>,
    pub client_order_id: Option<u64>,
}

here is my manual builder(for Market and Token I do not have manual builder yet so):

pub fn order_for(market_id: MarketId) -> OrderTarget<NoSession, NoUser> {
    OrderTarget::<NoSession, NoUser> {
        market_id,
        session_id: NoSession,
        user_id: None,
        marker: Default::default(),
    }
}

/// Typestate safe concise builder for various orders.
///
/// # Implementation details
/// Unfortunately most builders are not really useful,
/// Could try to use some - but still likely not really useful,
/// even after proper PlaceOrder refactoring (so it would be good)
#[allow(dead_code)]
mod private {
    use crate::{
        FillMode, MarketId, NonZeroOrderSize, NonZeroPriceMantissa, PlaceOrder, SessionId, Side,
        UserId,
    };
    use core::marker::PhantomData;

    pub struct FillModeLimit;

    pub struct FillModeImmediate;
    pub struct FillModePostOnly;

    #[derive(Clone, Copy)]
    pub struct SessionSet(pub SessionId);

    #[derive(Clone, Copy, Default)]
    pub struct NoSession;
    pub struct NoUser;
    pub struct UserSet;

    #[derive(Clone, Copy)]
    pub struct OrderTarget<Session, User> {
        pub market_id: MarketId,
        pub session_id: Session,
        pub user_id: Option<UserId>,
        pub marker: PhantomData<(Session, User)>,
    }

    impl OrderTarget<NoSession, NoUser> {
        pub fn session(&self, session_id: SessionId) -> OrderTarget<SessionSet, NoUser> {
            OrderTarget {
                session_id: SessionSet(session_id),
                market_id: self.market_id,
                user_id: self.user_id,
                marker: Default::default(),
            }
        }

        pub fn user(&self, user_id: UserId) -> OrderTarget<NoSession, UserSet> {
            OrderTarget {
                user_id: Some(user_id),
                market_id: self.market_id,
                session_id: self.session_id,
                marker: Default::default(),
            }
        }
    }

    impl OrderTarget<SessionSet, NoUser> {
        pub fn user(&self, user_id: UserId) -> OrderTarget<SessionSet, UserSet> {
            OrderTarget {
                user_id: Some(user_id),
                market_id: self.market_id,
                session_id: self.session_id,
                marker: Default::default(),
            }
        }
    }

    #[derive(Clone, Copy, Default)]
    pub struct NoSide;
    pub struct AmountSet;
    pub struct NoAmount;
    #[derive(Clone, Copy)]
    pub struct SideSet(pub Side);

    pub struct NoPrice;

    pub struct PriceSet;

    impl<User> OrderTarget<SessionSet, User> {
        pub fn post_only(
            &self,
            side: Side,
        ) -> OrderBuilder<FillModePostOnly, SideSet, NoAmount, NoPrice> {
            OrderBuilder {
                fill_mode: FillMode::PostOnly,
                session_id: self.session_id.0,
                market_id: self.market_id,
                user_id: self.user_id,
                side: SideSet(side),
                marker: Default::default(),
                price: None,
                size: None,
            }
        }
        pub fn limit(&self, side: Side) -> OrderBuilder<FillModeLimit, SideSet, NoAmount, NoPrice> {
            OrderBuilder {
                fill_mode: FillMode::Limit,
                session_id: self.session_id.0,
                market_id: self.market_id,
                user_id: self.user_id,
                side: SideSet(side),
                marker: Default::default(),
                price: None,
                size: None,
            }
        }

        pub fn fill_or_kill(
            &self,
            side: Side,
        ) -> OrderBuilder<FillModeImmediate, SideSet, NoAmount, NoPrice> {
            OrderBuilder {
                fill_mode: FillMode::FillOrKill,
                session_id: self.session_id.0,
                market_id: self.market_id,
                user_id: self.user_id,
                side: SideSet(side),
                marker: Default::default(),
                price: None,
                size: None,
            }
        }

        pub fn immediate_or_cancel(
            &self,
            side: Side,
        ) -> OrderBuilder<FillModeImmediate, SideSet, NoAmount, NoPrice> {
            OrderBuilder {
                fill_mode: FillMode::ImmediateOrCancel,
                session_id: self.session_id.0,
                market_id: self.market_id,
                user_id: self.user_id,
                side: SideSet(side),
                marker: Default::default(),
                price: None,
                size: None,
            }
        }
    }

    impl OrderBuilder<FillModeImmediate, SideSet, NoAmount, NoPrice> {
        pub fn size(&self, size: NonZeroOrderSize) -> PlaceOrder {
            PlaceOrder {
                fill_mode: self.fill_mode,
                session_id: self.session_id,
                market_id: self.market_id,
                user_id: self.user_id,
                side: self.side.0,
                price: None,
                size: Some(size),
                is_reduce_only: false,
                quote_size: None,
                client_order_id: None,
            }
        }

        pub fn price(&self, price: NonZeroPriceMantissa) -> PlaceOrder {
            PlaceOrder {
                fill_mode: self.fill_mode,
                session_id: self.session_id,
                market_id: self.market_id,
                user_id: self.user_id,
                side: self.side.0,
                price: Some(price),
                size: None,
                is_reduce_only: false,
                quote_size: None,
                client_order_id: None,
            }
        }
    }

    impl OrderBuilder<FillModePostOnly, SideSet, NoAmount, NoPrice> {
        pub fn amount(&self, size: NonZeroOrderSize, price: NonZeroPriceMantissa) -> PlaceOrder {
            PlaceOrder {
                fill_mode: self.fill_mode,
                session_id: self.session_id,
                market_id: self.market_id,
                user_id: self.user_id,
                side: self.side.0,
                price: Some(price),
                size: Some(size),
                is_reduce_only: false,
                quote_size: None,
                client_order_id: None,
            }
        }
    }

    impl OrderTarget<NoSession, UserSet> {
        pub fn session(&self, session_id: SessionId) -> OrderTarget<SessionSet, UserSet> {
            OrderTarget::<SessionSet, UserSet> {
                session_id: SessionSet(session_id),
                user_id: self.user_id,
                market_id: self.market_id,
                marker: Default::default(),
            }
        }
    }

    impl OrderBuilder<FillModeLimit, NoSide, NoAmount, NoPrice> {
        pub fn side(&self, side: Side) -> OrderBuilder<FillModeLimit, SideSet, NoAmount, NoPrice> {
            OrderBuilder {
                fill_mode: self.fill_mode,
                session_id: self.session_id,
                market_id: self.market_id,
                user_id: self.user_id,
                side: SideSet(side),
                marker: Default::default(),
                price: None,
                size: None,
            }
        }
    }

    impl OrderBuilder<FillModeLimit, SideSet, NoAmount, NoPrice> {
        pub fn amount(
            &self,
            size: NonZeroOrderSize,
            price: NonZeroPriceMantissa,
        ) -> OrderBuilder<FillModeLimit, SideSet, AmountSet, PriceSet> {
            OrderBuilder {
                fill_mode: self.fill_mode,
                session_id: self.session_id,
                market_id: self.market_id,
                user_id: self.user_id,
                side: self.side,
                size: Some(size),
                price: Some(price),
                marker: Default::default(),
            }
        }
    }

    impl OrderBuilder<FillModeLimit, SideSet, AmountSet, PriceSet> {
        pub fn reduce_only(&self) -> PlaceOrder {
            PlaceOrder {
                fill_mode: self.fill_mode,
                session_id: self.session_id,
                market_id: self.market_id,
                user_id: self.user_id,
                side: self.side.0,
                price: self.price,
                size: self.size,
                is_reduce_only: true,
                quote_size: None,
                client_order_id: None,
            }
        }
    }

    impl From<OrderBuilder<FillModeLimit, SideSet, AmountSet, PriceSet>> for PlaceOrder {
        fn from(val: OrderBuilder<FillModeLimit, SideSet, AmountSet, PriceSet>) -> Self {
            PlaceOrder {
                fill_mode: val.fill_mode,
                session_id: val.session_id,
                market_id: val.market_id,
                user_id: val.user_id,
                side: val.side.0,
                price: val.price,
                size: val.size,
                is_reduce_only: false,
                quote_size: None,
                client_order_id: None,
            }
        }
    }

    #[derive(Clone, Copy)]
    pub struct OrderBuilder<FillModeType, SideType, AmountType, PriceType> {
        market_id: MarketId,
        side: SideType,
        session_id: SessionId,
        user_id: Option<UserId>,
        fill_mode: FillMode,
        marker: PhantomData<(FillModeType, SideType, AmountType, PriceType)>,
        price: Option<NonZeroPriceMantissa>,
        size: Option<NonZeroOrderSize>,
    }
}

impl PlaceOrder {
    pub fn validate(&self) -> Result<&Self, Error> {
        match self.fill_mode {
            FillMode::Limit | FillMode::PostOnly => {
                ensure!(self.price.is_some(), Error::OrderMissingLimits);
                ensure!(
                    self.size.is_some() || self.quote_size.is_some(),
                    Error::OrderMissingLimits
                );
            }
            FillMode::ImmediateOrCancel | FillMode::FillOrKill => {
                ensure!(
                    self.price.is_some() || self.size.is_some() || self.quote_size.is_some(),
                    Error::OrderMissingLimits,
                    "IoC and FoK - require at least one limit specified"
                );
            }
        }

        Ok(self)
    }
}

dzmitry-lahoda avatar Sep 08 '24 00:09 dzmitry-lahoda

@Veetaha enums are good, but

  • we have a lot of code which uses PlaceOrder. we also have in proto file defined PlaceOrder which almost exact PlaceOrder in Rust (one to one mapping, with some numeric validations done in pure Rust side). we could also have directly use prost or other serde on PlaceOrder. so we cannot change it easy to enum.
  • we have complicated enum architecture. for example, there are fields A B C D. A enables B but prevents C. C enables B and D. so we would have replicated shared parts of enum B in both C and A. such sharing may not be good for some types of serde. also to design proper enums from scratch is hard. and also it will require to use enum dispatch like in code. in our case FillMode enum and size_* fields are forming some kind of groups/constraints

so groups vs custom enums is tradeoff.

but sure - enums will help in some cases. for example we build like this one:

#[derive(Debug, Clone, PartialEq, Eq)]
#[cfg_attr(test, derive(proptest_derive::Arbitrary, bon::Builder))]
pub struct Action {
    pub timestamp: i64,
    #[cfg_attr(test, builder(default))]
    pub nonce: u32,
    pub kind: ActionKind,
}

#[derive(Debug, Clone, PartialEq, Eq)]
#[cfg_attr(test, derive(proptest_derive::Arbitrary))]
pub enum ActionKind {
    CreateSession(CreateSession),
    RevokeSession(RevokeSession),
    CreateToken(CreateToken),
    CreateMarket(CreateMarket),
    PlaceOrder(PlaceOrder),
   ...
}

dzmitry-lahoda avatar Sep 08 '24 00:09 dzmitry-lahoda

Yeah, I understand enums may be unreasonable when the combinations of states would contain a lot of duplicate fields and projections of each other, and the number of variants may also be high... Therefore this feature is still needed. I'm just thinking what people usually do to solve this ptoblem today with the tools currently available. I suppose it's one of:

  • Writing a complex builder with typestate manually
  • Trying to reshape the model to enums if it's reasonable
  • If no reasonable enums representations exist, just do a builder that validates everything at run-time

Veetaha avatar Sep 08 '24 01:09 Veetaha

  • Writing a complex builder with typestate manually
  • Trying to reshape the model to enums if it's reasonable
  • If no reasonable enums representations exist, just do a builder that validates everything at run-time

yes, yes, yes, and.

validation starts from binary data, like comparing some hash or crc and/or signature.

then goes to serde level, whatever serde provides (proto file defined is very limited), what types and conversion supported.

than goes more domain validation (enums goes here), so mapping raw data after serde and hash verification into more domain like objects. domain objects are also useful to build data which converted to serde in tests.

and than goes validation within some data/storage context, when it is required to have external data (in our case order validation requires market and tokens states from storage).

So in general there is not solution which fits, there is just ergonomics here and there all around. Bon takes some nice niche to fill somewhere around.

dzmitry-lahoda avatar Sep 08 '24 08:09 dzmitry-lahoda

here is builder with groups https://docs.rs/const_typed_builder/latest/const_typed_builder/ .

  • Writing a complex builder with typestate manually
  • Trying to reshape the model to enums if it's reasonable
  • If no reasonable enums representations exist, just do a builder that validates everythwwwing at run-time

Validation at run time can be as complex as SAT solver, for example interfield JSON validation. But when there is trust in communication like sender always provides valid data, less validation on receiver needed. In general sender can send proof that his data is according some validation criteria.

So from perspective of productivity and some form of automation of thinking, groups seems good feature bringing a lot of value to developers. No need to maintain typestate pattern manually (correct one is hard to write), no need to reshape what is hard to reshape to enums(or harmful, for example performance, data compression, versioning, SoA vs AoS - all can just say no to enums), no need to validate on receive side in trust boundaries. Just compiler runs more checks.

Ultimate answer to those who ask why builder is needed imho.

dzmitry-lahoda avatar Sep 09 '24 10:09 dzmitry-lahoda

@Veetaha any ideas on impl details?

I think of next:

  • limit number of groups to 8 maximum, may be less - no compiler bombs
  • have test ring to check various variants - how fast it solves, how many types generated, how fast is compile, etc
  • have formal description of feature first (docs first), so can align before experiments
  • compatibility with other not yet implemented features - what are these?

dzmitry-lahoda avatar Sep 09 '24 10:09 dzmitry-lahoda

From const-typed-builder docs:

this implementation currently has a complexity of $O(2^g)$ where $g$ is the amount of grouped variables

That sounds horrible. I'm afraid of how easy it will be to fall into that exponential pit and how much overhead it may bring to bon overall in terms of its simplicity and maintenance burden.

As I see such ideas appeared in typed-builder as well, but they never came to fruition:

  • https://github.com/idanarye/rust-typed-builder/issues/6
  • https://github.com/idanarye/rust-typed-builder/issues/14

At which point I start to question if it's even worth it. It's probably much simpler for people to do such group checks at runtime, for which we already have a mechanism - putting a #[builder] on a function or method.


I'd be glad to be proven wrong. Maybe there is some good subset of this feature that scales fine and solves a big chunk of the "member group" problem. For example, suppose that we just don't support at_least(N) clause. Maybe simple requires/conflicts_with clauses that specify just a single member. Will that allow us to avoid the exponential complexity?

Anyway, I think the way to proceed here:

  • Study existing issues and implementations from other crates (like the two issues in typed-builder that I mentioned and const-typed-builder itself)
  • Start very simple with a very limited subset of feature and try to prototype its implementation outside of the proc-macro world. Meaning, prototype the generated code itself such that it's obvious how to codify the logic behind and make sure such logic will scale (not exponentially).

Veetaha avatar Sep 09 '24 11:09 Veetaha

compatibility with other not yet implemented features - what are these?

You can see all the features planned in the issues in this repo. I don't think there are any of them in conflict with this feature yet.

Another thing that is currently only in my mind is some kind of partial stabilization of the type state API. For example, provide some set of traits for the users so they can add custom setters/getters (basically any methods) to the builder.

It's something along the lines of https://github.com/idanarye/rust-typed-builder/issues/23 and https://github.com/idanarye/rust-typed-builder/issues/22.

In fact, maybe such a feature would actually be a better replacement for the "groups" feature. It would require people to write a bit more manual code, but probably not as much as they'd do writing such type state from scratch.

Veetaha avatar Sep 09 '24 11:09 Veetaha

That sounds horrible.

2*8 is just 1024, no so big number. just limit number of groups and it becomes constant.

how much overhead it may bring to bon overall in terms of its simplicity and maintenance burden.

Currently bon is over engineered, I believe amount of code of macro can be 2x less with still using quote (problem is flat contexts and existing separation - it feels can do other split). Also it does not steam tokens, but allocates and concats pieces of heaps of token streams, while could do just direct streaming as code produced. Quote may also be less used for composability (will give speed and simplicity too imho).

But if to say if any interfield feature will give more complexity? Yes. Any, including manually expressing next for sure:

In fact, maybe such a feature would actually be a better replacement for the "groups" feature. It would require people to write a bit more manual code, but probably not as much as they'd do writing such type state from scratch.

groups of other kind? yeah it seems all depends what ""groups"" allow. Like if if you set this filed, must set other 2, but cannot preventing setting up other fields. that is group sub feature/projection.

there are various projection of """groups""" which

but they never came to fruition:

the person who did a lot in that area was not collaborative, he for some reason use codeberg to store code under some obscure licence. while he should have been directly PR into repo... also he was weird stating his algo is best or at all writing algo from scratch. there likely industrial grade solution to solve """groups""". i guess const-typed-builder more close to things.

It's probably much simpler for people to do such group checks at runtime,

Yes. But runtime is like dynamic typing or untyped code. All static typing of Rust can be done in assembly/C just by writing unit tests. But for some reason people thing Rust is better? Just because type system checks is close to place of declaration and instantiation, not some code else where which may be even unsound.

dzmitry-lahoda avatar Sep 09 '24 12:09 dzmitry-lahoda

@Veetaha

Maybe there is some good subset of this feature that scales fine and solves a big chunk of the "member group" problem.

yeah, there are projection, including which you mentioned manual approach with tuples. So for me tuples smell https://github.com/lloydmeta/frunk which is not idomatic Rust.

but projection searching which is does not grows so fast sure is thing.

like 2^g is bad. what if there is 2^(log g)? For 8 groups it is just 2^3 which basically is C. O(C). If g is limited all becomes C. So with log can limit to max 16g (ever nobody will want more), it is already just 2^4 variants. So exponential is kind of fine if it is log and hard limit.

Maybe simple requires/conflicts_with clauses that specify just a single member.

May be, singular requires/conflics allows for chains(hierarchical) of requires/conflicts which may be easy to solve.

And yeah, need to find what is most useful alternative to refactoring to enums

dzmitry-lahoda avatar Sep 09 '24 13:09 dzmitry-lahoda

2*8 is just 1024, no so big number. just limit number of groups and it becomes constant.

Note that the documentation in const-typed-builder mentions that the N in 2 ** N is the number of members in the group (not the number of groups).

Currently bon is over engineered, I believe amount of code of macro can be 2x less with still using quote (problem is flat contexts and existing separation - it feels can do other split). Also it does not steam tokens, but allocates and concats pieces of heaps of token streams, while could do just direct streaming as code produced. Quote may also be less used for composability (will give speed and simplicity too IMHO).

I don't understand how you'd decrease the amount of code by 2 times while preserving the existing features.

Regarding "flat contexts and existing separation - it feels can do other split", what do you mean by "flat context", and which separation do you think of? Isn't another split going to increase the amount of code (not decrease it)? The approach of using quote for composition is the go-to common approach that is used by basically every proc macro crate, and I don't see any problem with that. What would this "direct streaming" even look like?

Veetaha avatar Sep 09 '24 13:09 Veetaha

what do you mean by "flat context", and which separation do you think of?

I mean there is exists some kind of flat split of code(ctx, non orthogonal things), could merge all code together, shuffle it, may by using quote! less, and than unmerge back into more like graphy/orthogonal/composeble state which gives less code and more asier to compose in features.

That is what I did and others in ElastichSearch/SQL queries/builders/DSL if to code in F# style for example. quote! kind of also hinders some ability to compose.

What would this "direct streaming" even look like?

There are several concats of tokenstreams, and also I added one for getters. While could separate stages, one is just assemble one big graph, and than traverse it streaming into TokensStreams, never concat. Several stages.

I would not tell you how exactly it can be done, I just did it in various places similar things, it just seems visibly applicable here. I understand why people write proc macro in flat non streaming way, may be similar reason people write Go, not Rust.

UPDATE:

There is essence of builders and getsetters, which may be essences from code and than some well know combinations of code transformations summaries, in the end there will be graph, which when transverse just generates one single token stream.

dzmitry-lahoda avatar Sep 09 '24 13:09 dzmitry-lahoda

While could separate stages, one is just assemble one big graph, and than traverse it streaming into TokensStreams, never concat. Several stages.

Yeah, I understand that it's possible to add yet another layer of type models, whereby the code will construct this graph structure (like an execution plan in terraform) and then convert it all into a TokenStream in a single pass. The current BuilderGenCtx is that one structure but it is intentionally quite "flat", because that is good enough for the current state of things, and for the rest raw TokenStream works fine. If you need any more strict typing then syn::parse_quote!() can be used to operate with the syn's typed AST instead of quote!().

All-in-all any suggestion to perform a split will increase the code size and not decrease it because it adds more abstractions, and thus more type definitions, more functions and more code in general. So the ideas to "cut the code size by 2" and "do a split" contradict each other.

Keeping the number of types and models to the exact required minimum is what I think for the better and simpler to understand code

Veetaha avatar Sep 09 '24 13:09 Veetaha