bon icon indicating copy to clipboard operation
bon copied to clipboard

Is changing `String` to `Into<String>` correct?

Open rben01 opened this issue 1 year ago • 8 comments

When using #[builder] on a function, String arguments are turned into Into<String>. This feels wrong — String may have been an intentional choice by the author to, for instance, force allocations to be easily visible. Changing it to Into<String> defeats the purpose of that.

I think it's perfectly fine, and even correct, for callers to have to call .into() on their strings, just as if they had called the function using normal Rust syntax.

rben01 avatar Jul 30 '24 02:07 rben01

Thank you for bringing this up! This is definitely a controversial topic because of the confrontation between high-level non-performance-sensitive code and low-level performance-sensitive code. This confrontation influences the default behaviour of bon and its current choice of Into<T>.

Also, I'd like to make it clear from the beginning, that String is not a special case. Implicit Into cab be added to any setter based on a particular set of rules.

Advantages of implicit Into<String>:

  • It reduces the amount of boilerplate .into() or .to_owned() or .to_string() calls at the call site. The main use case for this is passing "string literals" directly to the setter.
  • Suppose you have an enum of two variants A(TypeA) and B(TypeB). This enum implements From<TypeA> and From<TypeB>. When such an enum is used in a setter with Into<Enum> it's possible to pass the value TypeA or TypeB directly without having to wrap them into an enum explicitly (which reduces boilerplate).
  • I often switch between &str and String types in my code due to various reasons. Having the setter accept any of them makes it easier to iterate on code and easier to change it.

Disadvantages of implicit Into<String>

  • Hides allocations from the reader. A potential footgun is that passing &String as Into<String> will also perform an implicit clone of that value.
  • Makes the signature a bit more complex to read
  • Makes the setter generic and thus complicates type inference in certain situations for the compiler. For example, if the the method accepts an Option<impl Into<T>> then you can't pass just None to such a setter, you'd need to specify the type T for the None literal, for example None::<T> for the compiler to know what concrete impl Into<T> was meant for the None literal.

We can not chose just one of the approaches

In high-level code people usually don't care about allocations that much. Ergonomics of the bon syntax is preferred more here. For example, when bon is used to define an API function that invokes a REST HTTP API call, then that has many string parameters, then it'd be much more convenient to fill all those string parameters without the need to call .into(), .to_owned() or .to_string() on &str literals.

In low-level code, the opposite is true. Every allocation matters and people need to see them with their eyes, because one such implicit allocation may be the performance bottleneck. It's fine to write more verbose code in this case in the sake of better performance.

Current status quo

I admit, bon was designed with some bias towards the high-level kind of code because I've been writing this kind of code at my work. In that context, I know from experience how many .into()/.to_owned() calls I and my team have been writing where performance doesn't actually matter.

However, I don't want bon to be focused only on high-level code. I'd like bon to be useful for low-level code as well, but still, the default behaviour needs to make some preference for only one of the classes of code (high-level or low-level) because there is just one default behaviour available.

Since I've already made that choice in the current 1.0 release of bon, we can take it for granted that the default behaviour of bon for Into conversions leans towards high-level code.

Now, what can we have to satisfy the needs of the low-level code? Today there is already a way to disable the implicit Into conversions with #[builder(into = false)] on the setter (see the reference for this attribute).

However, having to specify #[builder(into = false)] on every setter creates a lot of boilerplate too. I've been thinking of adding the ability to set #[builder(into = false)] at the top-level of the struct or function where the #[builder] attribute was placed to override the default behaviour of automatic Into for all setters at once.

Example of potential new #[builder] API:

use bon::builder;

// Disables automatic `Into` for all setters
#[builder(into = false)]
struct Example {
    foo: String,
    bar: String,

    // Overrides the top-level `into = false`
    // The setter **will** have an `Into` conversion
    #[builder(into)]
    baz: String,
}

Example::builder()
    // These setters accept `String`
    .foo("foo".to_owned())
    .bar("bar".into())
    // This setter accepts `impl Into<String>`
    .baz("doesn't need an explicit `into` or `to_owned()`")
    .build();

Do you think this solution would satisfy this requirement for low-level code and your requirements in particular?

Veetaha avatar Jul 30 '24 04:07 Veetaha

I just think that because rust already gives function authors the ability to make their arguments impl Into<T>, there is no need for bon to second guess their intent and turn places where they didn't write that into impl Into<T> as well, even if it is more ergonomic for callers. If bon is to do automatic impl Into<T> conversion — and I'm not even sure it should — I definitely think it should not be the default. It might even belong in a separate crate, though.

rben01 avatar Jul 30 '24 04:07 rben01

even if it is more ergonomic for callers might even belong in a separate crate, though.

This is the main thing. The bon crate is this exact crate that does this. One of the core pillars and goals of bon is "Great ergonomics" and "Zero boilerplate" (they are mentioned on the homepage). Unfortunately, removing automatic Into conversion will go against this. I can only propose a solution with the configuration to explicitly disable this behaviour as mentioned at the end of my comment https://github.com/elastio/bon/issues/15#issuecomment-2257437916.

Also, note that Into<String> doesn't actually imply runtime overhead and worse performance per se. impl From<T> for T is a zero overhead identity function, so if you pass String to the setter that accepts impl Into<String> then it will do no additional allocations.

Veetaha avatar Jul 30 '24 09:07 Veetaha

Perhaps you could enable a 'no into' feature in the cargo.toml which would disable this throughout the crate? This might be good because people who don't want the impl Into functionality probably won't want it for anything in the crate anyway

ThomasAlban avatar Aug 08 '24 08:08 ThomasAlban

Unfortunatelly, cargo features don't work that way. They must be additive because they undergo feature unification between all the crates in the dependency tree. There can't be one crate that uses bon with into feature disabled and other crate that uses into feature enabled and they see entirely different function signatures. All crates in the dependency tree must be compatible with a single set of features of bon. It means bon with into feature enabled must be fully compatible with bon without such feature enabled. It will be kinda compatible in most cases, but impl Into in the function signature influences type inference, which may lead to compile errors for code that was written against bon without the into feature.

Veetaha avatar Aug 08 '24 11:08 Veetaha

Another possibility. bon may expose a separate version of the builder macro under a different module name. For example

use bon::explicit::builder;

It would expose an "explicit" version of the builder macro that doesn't have automatic into conversions. This way the syntax will be simpler for the use case where automatic into conversions aren't desired. You just need to import builder from the explicit module and you won't need to set #[builder(into = false)] on each macro usage.

The naming is debatable though.

However, I'd like to be sure that this is the thing that will be used a lot, such that it really merits such a separate symbol in bon crate.

Veetaha avatar Aug 08 '24 11:08 Veetaha

I'm revisiting this issue again. The problem with impl Into that's been biting me lately is the weakened type inference. For example, if there is a member of type Config and you try to do .member(serde_json::from_str(str)?), then you have to explicitly specify the type hint serde_json::from_str::<Config>(str)? because the parameter's type is impl Into<Config>, which could be any type.

Another downside of this - the complexity of the signature. It becomes harder to read it, and this indeed's been biting me as well. Also having to learn the rules of auto-into conversions is another aspect of bon that makes it harder to adopt. It's quite easy to shoot yourself in the foot with this.

Given your feedback as well, I'm more inclined to disable automatic Into conversions by default. Maybe it's better to do a 2.0 update earlier than later while there isn't too much code using bon yet.

Veetaha avatar Aug 12 '24 15:08 Veetaha

Here are more thoughts on the design for this. The default #[builder] macro should generate no automatic impl Into on setters. The automatic into conversion rules will be removed.

So what will we have instead? A new attribute #[builder(on(type_pattern, attrs))]. It lets you explicitly define your own rules for Into conversions by specifying the type_pattern (e.g. String, Vec<_>, where _ is used as a wildcard), and specify which attrs you'd like to apply to members of this type. This makes it extensible so that new attributes can be configured in the future this way.

The attribute must be placed at the top level. Example:

#[builder(on(String, into))]
struct Example {
    // Automatically gets #[builder(into)]
    id: String,
    
    // Automatically gets #[builder(into)]
    // Even though the pattern is just `String`, it still matches `Option<String>`,
    // because `Option` is used as a special marker for optional fields. It also
    // makes this consistent with changing `Option<String>` to 
    // `#[builder(default)]` on a String
    description: Option<String>,

    // Isn't of type `String`, so no into conversion
    score: u32,
}

You may also specify on(..) several times to configure into conversions for several types e.g. #[builder(on(String, into), on(Vec<_>, into))].

This is how I'd like to approach this. I don't want the on(..) attribute to be some complex DSL, so I'll start simple like this. I'll see how people use/like it and work from there.

Veetaha avatar Aug 19 '24 18:08 Veetaha

The PR https://github.com/elastio/bon/pull/54 that implements the behavior described in https://github.com/elastio/bon/issues/15#issuecomment-2297202745 was merged into master. I'll get it released with #56 implemented as well ASAP

Veetaha avatar Aug 26 '24 13:08 Veetaha