smithy4s icon indicating copy to clipboard operation
smithy4s copied to clipboard

Validated newtypes

Open denisrosca opened this issue 1 year ago • 13 comments

Add an option to render constrained newtypes over Smithy primitives as "validated" newtypes.

Closes #966

  • [x] Add validated newtypes
  • [x] Make validated newtypes rendering opt-in via metadata
  • [x] Handle primitives
  • [ ] user defined refinements?
  • [ ] collections?

PR Checklist (not all items are relevant to all PRs)

  • [ ] Added unit-tests (for runtime code)
  • [x] Added bootstrapped code + smoke tests (when the rendering logic is modified)
  • [x] Added build-plugins integration tests (when reflection loading is required at codegen-time)
  • [ ] Added alloy compliance tests (when simpleRestJson protocol behaviour is expanded/updated)
  • [ ] Updated dynamic module to match generated-code behaviour
  • [x] Added documentation
  • [ ] Updated changelog

denisrosca avatar Mar 19 '24 12:03 denisrosca

Hey @denisrosca , thanks for looking into this.

However : validated newtypes should not be rendered by default, for the simple reason that it'd break a lot of user code and that's not really acceptable now that smithy4s has received a reasonable amount of adoption. So it should be an opt-in, controlled by means of smithy metadata (with a key prefixed withsmithy4s)

Baccata avatar Mar 19 '24 13:03 Baccata

Hey @denisrosca , thanks for looking into this.

However : validated newtypes should not be rendered by default, for the simple reason that it'd break a lot of user code and that's not really acceptable now that smithy4s has received a reasonable amount of adoption. So it should be an opt-in, controlled by means of smithy metadata (with a key prefixed withsmithy4s)

Working on that 👍 Just wanted to create the PR as early as possible to get some 👀 on the newtype encoding.

denisrosca avatar Mar 19 '24 13:03 denisrosca

@Baccata @kubukoz I've hit a bit of a problem after adding the metadata conditional.

The main issue is that now the same smithy source file can be rendered in different ways depending if the metadata flag is set to true or false.

Consider this smithy definition:

namespace newtypes.validated

use smithy4s.meta#unwrap
use alloy#simpleRestJson

@length(min: 1, max: 10)
string ValidatedCity

@length(min: 1, max: 10)
string ValidatedName

@unwrap
@length(min: 1, max: 10)
string ValidatedCountry

structure Person {
  @httpLabel
  @required
  name: ValidatedName

  @httpQuery("town")
  town: ValidatedCity

  @httpQuery("country")
  country: ValidatedCountry
}

which gets rendered to:

final case class Person(name: ValidatedName, town: Option[ValidatedCity] = None, country: Option[String] = None)

object Person extends ShapeTag.Companion[Person] {
  val id: ShapeId = ShapeId("newtypes.validated", "Person")

  val hints: Hints = Hints.empty

  implicit val schema: Schema[Person] = struct(
    ValidatedName.schema.required[Person]("name", _.name).addHints(smithy.api.HttpLabel()),
    ValidatedCity.schema.optional[Person]("town", _.town).addHints(smithy.api.HttpQuery.unsafeApply("town")),
    ValidatedCountry.underlyingSchema.optional[Person]("country", _.country).addHints(smithy.api.HttpQuery.unsafeApply("country")),
  ){
    Person.apply
  }.withId(id).addHints(hints)
}

This fails to compile because smithy.api#httpQuery is defined like this:

@length(min: 1)
string httpQuery

which when converted from smithy to IR is interpreted as a validated newtype. As a result the hint in Person.schema is rendered via a call to smithy.api.HttpQuery.unsafeApply which leads to a compilation error:

[info] [error] /private/var/folders/_z/c52c1q4n0fj6617nkht2pgcr0000gn/T/sbt_59b16f00/target/scala-2.13/src_managed/main/scala/newtypes/validated/Person.scala:18:89: value unsafeApply is not a member of object smithy.api.HttpQuery
[info] [error]     ValidatedCity.schema.optional[Person]("town", _.town).addHints(smithy.api.HttpQuery.unsafeApply("town")),

As a workaround I'm excluding the smithy.api namespace from the validated newtypes feature, but that doesn't seem like a bullet proof solution.

denisrosca avatar Mar 25 '24 17:03 denisrosca

I see. This is probably not specific to smithy.api, but a symptom of a more general problem: we should make sure that we refer to newtypes in the same way they were generated, even across codegen runs (e.g. smithy4s runs its own, then you have a library on top, then you have the user code).

If the default is "false" (unvalidated newtypes), then you can only go from false (nothing) to true (by setting the metadata key). Perhaps keeping the "previous value" in smithy4sGenerated metadata would be the way?

Shapes from a known previously-generated namespace would be referred to in the style they were generated, and the shapes in the currently-being-generated compilation unit would use the metadata key.

Regardless of whether we do it this way or another, I think smithy.api could get the validated treatment from day 1.

kubukoz avatar Mar 25 '24 21:03 kubukoz

If anything, this is yet another argument for rendering dynamic bindings instead of static ones. As much as I hate to say it, the arguments in favour of dynamic bindings are starting to largely outweigh the ones in favour of static binding.

Regardless of whether we do it this way or another, I think smithy.api could get the validated treatment from day 1.

I disagree (somewhat heavily). I appreciate that validated newtypes may be a desirable feature for users, but it's also not something I want to deal when writing unit tests in smithy4s that involve hints, and arguably the maintainers of smithy4s are the ones who are most expose to the shapes coming from the "smithy.api" namespace ... unless you could macro-ify the validation when dealing with literals.

Putting aside the "should we just give up and render dynamic bindings", what we could do is create a meta-trait that would trigger the render of newtypes as validated. The metadata flag would indicate that a model preprocessor should be run to add the meta trait to all relevant shapes. The meta-trait would be stored in the resource of published artifacts, alongside the generated code, and could be inspected by downstream code-gen runs (unlike smithy4s-prefixed metadata, which get eluded).

Baccata avatar Mar 26 '24 14:03 Baccata

it's also not something I want to deal when writing unit tests in smithy4s that involve hints

We discussed this briefly with Denis, I suggested to try and see how much stuff breaks: if it's more than a handful of locations let's keep them as-is, but if it's just a few I'd say unsafeApply is perfectly fine. Let's be pragmatic ;)

The metadata flag would indicate that a model preprocessor should be run to add the meta trait to all relevant shapes

hmm, so that way users could still apply the meta-trait on a case-by-case basis even if metadata is false / undefined. That'd be cool, I think.

The meta-trait would be stored in the resource of published artifacts, alongside the generated code, and could be inspected by downstream code-gen runs (unlike smithy4s-prefixed metadata, which get eluded).

Not sure what you mean here - I was thinking of using the smithy4sGenerated metadata, which does get inspected by downstream codegen runs.

kubukoz avatar Mar 26 '24 16:03 kubukoz

Not sure what you mean here - I was thinking of using the smithy4sGenerated metadata, which does get inspected by downstream codegen runs.

Well, thing is in order to use the generated metadata, you'd basically have to store a Map[ShapeId, Something], which is somewhat redundant where you could have a @something trait instead, which you could inspect from the shapes itself without having to thread the map where it needs to be inspected. It's arguably more idiomatic to smithy to use traits to store metadata related to specific shapes.

but if it's just a few I'd say unsafeApply is perfectly fine

To be transparent, my concern also has to do with the following aspects :

  1. binary compatibility. If you want smithy.api and alloy shapes to be validated newtypes, it'd unavoidably means moving this PR to 0.19, as it'll be breaking bincompat of generated traits.
  2. convenience of using hints in annotations. This concern is tied to my prototype https://github.com/Baccata/smithy4s-deriving. The inconvenience of validated types in hints would then be felt by end users.

Baccata avatar Mar 29 '24 11:03 Baccata

moving this PR to 0.19

It's already on 0.19 :)

kubukoz avatar Mar 29 '24 11:03 kubukoz

My bad, haha

Baccata avatar Mar 29 '24 12:03 Baccata

Putting aside the "should we just give up and render dynamic bindings", what we could do is create a meta-trait that would trigger the render of newtypes as validated. The metadata flag would indicate that a model preprocessor should be run to add the meta trait to all relevant shapes. The meta-trait would be stored in the resource of published artifacts, alongside the generated code, and could be inspected by downstream code-gen runs (unlike smithy4s-prefixed metadata, which get eluded).

I'm currently exploring this option and have a question about how the metadata flag would be passed to the generator.

I have a ValidatedNewtypesTransformer model transformer that, when enabled, adds a new @validateNewtypeConstraints (name is still up for debate) trait to the relevant shapes.

My initial attempt was to basically do the same thing we do for smithy4sRenderOptics (i.e. have a setting in the build plugin that gets written to generated-metadata.smithy), but that presents a bit of a problem. The issue is that the decision if ValidatedNewtypesTransformer should be executed needs to happen before loading the smithy model, but that information is in generated-metadata.smithy which is read by the model loader.

What do you think about making this setting a full-blown generator argument instead of passing it via the generated metadata file?

denisrosca avatar Apr 09 '24 13:04 denisrosca

What do you think about making this setting a full-blown generator argument instead of passing it via the generated metadata file?

You make a good point, it'd be silly to load the model twice. However, I'd like to avoid multiplying codegen arguments which need to be forwarded to 3 different "front-ends" (SBT/Mill/cli), in favour of allowing some level of configuration via the reading of a smithy-build.json file. Interestingly, there was a discussion this week about this.

Baccata avatar Apr 09 '24 15:04 Baccata

@Baccata I had a brief discussion with Denis, we're thinking to "just":

  • make the transformer run unconditionally (like e.g. AwsConstraintsRemover), but
  • make it check the metadata key before applying any changes

That should resolve the double-loading problem, and the configuration continues to live in the Smithy files. Did we get this right?

kubukoz avatar Apr 09 '24 20:04 kubukoz

That's sensible, actually !

Baccata avatar Apr 09 '24 20:04 Baccata

It's getting close 👍 just wanted to say that your efforts are much appreciated, and apologise for how long it takes me to review

Baccata avatar Jun 07 '24 07:06 Baccata

well, this is fun - I can't reproduce the failure locally. Merged series/0.18 into this one to see if that helps reproduce it, but no dice. Shall get back to it after the weekend...

kubukoz avatar Jun 22 '24 23:06 kubukoz

🎉

kubukoz avatar Jun 24 '24 23:06 kubukoz