magnolia icon indicating copy to clipboard operation
magnolia copied to clipboard

Exclude params and cons from the derivation based on annotations

Open vigoo opened this issue 4 years ago • 35 comments

I should have opened an issue about this originally instead of discussing it immediately in a proposed implementation (https://github.com/softwaremill/magnolia/pull/270) so I wanted to move it here now to restart discussion.

The problem

The real-world example for the requested feature came up while trying to migrate the semi-auto derived binary codecs in https://github.com/vigoo/desert/ from shapeless to magnolia. It supports marking case class fields and sealed trait constructors as transient. These transient subtrees of a data type may have types for which there is no implicit binary codec available and it cannot be automatically derived either. Even though the transient annotations can be observed from magnolia's combine/dispatch methods, as the derivation engine does not know about these application-specific exclusion rules it tries to gather typeclass instances for these members too, and fails.

An example for using these transient annotations is when serializing Akka-typed messages for remoting but using a subset of the message type for local-only communication, transferring non-serializable references (for example an akka stream or some local resources).

In the original PR I proposed a solution and then we discussed some alternatives. I did not check yet how the feature could be implemented in the new Scala 3 version.

It would be nice to support directly this with annotations because that way desert could be moved to Magnolia without changing its users code but maybe some breaking changes there are still better than maintaining a hand-written shapeless and a scala3 macro version in parallel ;)

What do you think?

vigoo avatar May 20 '21 09:05 vigoo

Hi Magnolia folks. Have there been any thoughts on this? I'm working for a company that would like to adopt Desert to do long-term serialization (hence the need for schema migration) but the presence of Shapeless is a non-starter because of compile-time issues. Could this issue be solved in magnolia on a lower level?

deusaquilus avatar Aug 19 '21 21:08 deusaquilus

I don't think there has been any developments around this, but we're open to discussion and contributions :) I guess the main question is what would happen when calling caseClass.param.typeclass for a transient field? An exception would be thrown?

adamw avatar Aug 20 '21 07:08 adamw

I think that's reasonable. Just like calling .get on a None or a Failure.

deusaquilus avatar Aug 20 '21 12:08 deusaquilus

Ok, well I don't see a reason not to support such transient fields. We just have to come up with a good name (@deriveTransient?) and implement :)

adamw avatar Aug 20 '21 13:08 adamw

@adamw so you suggest to have a built-in @deriveTransient (or whatever) annotation for this. I think that would work and probably the simplest approach.

Two things come to my mind:

  • It should work both on fields and constructors
  • In desert transient fields have a default value for deserialization so I was using two different annotations: https://github.com/vigoo/desert/blob/master/desert-core/src/main/scala/io/github/vigoo/desert/transient.scala If there is a general purpose transient annotation in magnolia then the library could have a @default(value) annotation and then for transient fields it would require these two annotations. Only drawback is that verifying that both exists would move to the derivation while with a single annotation it forces the user to immediately fill it with the value. On the other hand this default value parameter is too specific for my library (or the domain of serialization at least) so probably shouldn't be Magnolia's concern.

vigoo avatar Aug 20 '21 13:08 vigoo

But the default would have to be per-typeclass, so I don't think it applies here.

A downside of an annotation is that it requires modifying the case class (derivation source), but then there's no other easy way of providing additional derivation metadata.

adamw avatar Aug 20 '21 13:08 adamw

One more thing came to my mind. If there is a global magnolia-defined transient annotation that would shortcut all derivations regardless of the typeclass which may not be a good idea. For example let's say we don't want to serialize transient fields with my serialization library but you may still want to derive a Show instance that works with the full data type.

To avoid this there could be some overrideable behavior in the derivation class

  • at least to enable/disable the transient feature for a given derivation
  • or, probably better, a isTransient(annotations: List[StaticAnnotation]): Boolean like function

The second would allow different libraries to define their own transient annotations which would also solve the typeclass-specific default value behavior.

Of course it may not be very likely that people are marking different subsets of their types for different type class derivations but still this feels like a safer/more generic solution to me.

vigoo avatar Aug 20 '21 13:08 vigoo

I would suggest something more general again 😄 https://github.com/softwaremill/magnolia/pull/279#issuecomment-766475143

joroKr21 avatar Aug 20 '21 14:08 joroKr21

@joroKr21 I like that :) Even better, maybe we could simply parametrise the derivation macro with a code that would decide, if a field should be ignored or not (and possibly provide a default value)? That way you could use annotations (which could be one of the implementations of the parameter provided by default), but also implement other, e.g. name-based strategies. The code in question would be evaluated at compile-time of course.

adamw avatar Aug 20 '21 15:08 adamw

I think that could work in Scala 3 where we have more options (by passing inline parameters and such) but not so much in Scala 2 where we can only pass additional type parameters to the macro.

joroKr21 avatar Aug 20 '21 15:08 joroKr21

I was thinking about Scala 3. I think the design is more or less sealed and "done" for Scala 2.

adamw avatar Aug 20 '21 15:08 adamw

Ah I was thinking about both Scala 2 and 3, because I think @deusaquilus 's use case is for Scala 2 for now.

@adamw thinking about the macro to decide if a field should be ignored or not - doesn't this lead us to an alternative where simply the typeclass provided for the derivation is lazy? That way the transient field support would be to simply check the annotations (or name etc) and does not access the derived typeclass at all. I'm not sure how hard would that be to implement though.

vigoo avatar Aug 20 '21 15:08 vigoo

Is lazy-derivation at compile-time possible? How do you do it?

deusaquilus avatar Aug 20 '21 15:08 deusaquilus

No I was just thinking it through better and realized that would not be possible. (At least I don't see how)

vigoo avatar Aug 20 '21 15:08 vigoo

Would @deriveTransient still be possibility for Scala 2? I think annotations, even non-configurable ones, are fine so long as they are simple and scarce. Coming from the DB world I have many, many scars from horrible things like @NamedQuery(<20 lines of code here>) but I don't think we need to religiously prohibit annotations based on those things.

deusaquilus avatar Aug 20 '21 15:08 deusaquilus

I think the discussion is not really about whether to use annotations or not, but rather how configurable this behavior is. And probably also about which version (Scala 3 or both 2 and 3) we can implement it.

I would be happy if we can release it for both versions (and happy to help implementing it) and would allow at least the level of configuration I suggested above, but also like the even more general configuration idea of @joroKr21 :) Not completely clear though where those configuration types would be defined. An inner type defined in the derivation object found by name?

vigoo avatar Aug 20 '21 15:08 vigoo

I agree that a more configurable design is very desirable for Scala 3 but it would be nice to have at least something for Scala 2. Perhaps we could worry about things like configurability and selective typeclass exclusion in Scala 3 but just have a bare-minimum implementation in Scala 2 that just does a single-field exclusion (i.e. for all derivations). So we could do @deriveTransient for Scala 2 and for Scala 3 that would just be a overridable default of some sort (hence it can be backwards compatible).

deusaquilus avatar Aug 20 '21 15:08 deusaquilus

I think the discussion is not really about whether to use annotations or not, but rather how configurable this behavior is. And probably also about which version (Scala 3 or both 2 and 3) we can implement it.

Yes, that's right - we would still need to mark the field in question in some way (in Scala 2 I can only think of annotations as feasible). The question is what should that annotation be? Something provided by Magnolia? Or should the library authors define their own. The configuration approach enables the latter. E.g. default makes sense mostly when deserializing something. Maybe for another typeclass there are different arguments needed to that annotation.

Not completely clear though where those configuration types would be defined. An inner type defined in the derivation object found by name?

That would be up to the library author. It just has to be passed as a type parameter to the macro. In Scala 2.13 it could be even a refined type directly there (because we have access to singleton types). That would also work in Scala 3 but we could come up with something better there.

joroKr21 avatar Aug 20 '21 16:08 joroKr21

I think the configuration approach is really elegant, it would give different authors the ability to define custom annotations for their derivations resolving many potential separation-of-concerns issues. My question is, what is the effort level that this would involve? Is it worth doing something this nice for Scala 2?

deusaquilus avatar Aug 20 '21 16:08 deusaquilus

...maybe in Scala 2 we can just do one simple annotation and in Scala 3 have it be configurable:

  // In Scala 3...
  object CsvConfig extends Magnolia.Config {
    type Proxy = Csv.type
    type Ignore = transient // (default: deriveTransient)
    final val readOnly = true
    final val minFields = 0
    final val maxFields = -1
    final val minCases = 0
    final val maxCases = -1
  }

deusaquilus avatar Aug 20 '21 16:08 deusaquilus

It does seem these are two separate issues/tasks really :) If anybody is up to implementing it, I think we can go with the config as @joroKr21 suggested for Scala2. For Scala3 I think I would prefer just passing some function which would decide whether to ignore / provide default or not (without mandating that annotations are involved, custom or not), as that's more flexible.

I'm away until 1st September but in case you'd have a PR @wojciechUrbanski or @mbore might step in, review, merge & release.

adamw avatar Aug 20 '21 16:08 adamw

Wait a minute, how is it possible to read value-level elements e.g. final val minFields = 0 from CsvConfig inside of the macro? You can CsvConfig fields as Tree elements from the TypeTag.tpe but you can't get their values, at least not from there...

deusaquilus avatar Aug 24 '21 17:08 deusaquilus

If it was a literal type e.g. type minFields = 0 that should be possible. There's maybe one other way you could do it but that involves multiple compilation units.

deusaquilus avatar Aug 24 '21 17:08 deusaquilus

Anyway, implementing this thing seems like a pretty big task for a fairly small feature that we need. In the least you would need to:

  1. Create genWith which I'm guessing would be an overloaded version of gen. This would probably involve some refactors.
  2. Grab the necessary fields from CsvConfig. Resolve the minFields etc... into actual values (as a mentioned before, these probably need to be literal-types), and then pass them into a configuration-holder of some sort.
  3. Pass this configuration-holder all the places that need it and...
  4. Implement the derivation-exclusion annotation functionality and pass the configuration for which annotation to check into there.

Is there any chance we could do something simpler just to unblock desert/serialization stuff? (I apologize for sounding like an impatient goon in advance...)

deusaquilus avatar Aug 24 '21 17:08 deusaquilus

Wait a minute, how is it possible to read value-level elements e.g. final val minFields = 0 from CsvConfig inside of the macro?

That makes it a constant with a literal type (for Scala 2.12's sake - on Scala 2.13 you could just say genWith[Config { val minFields: 0 }, Foo] - yes it's a bit restrictive to use only literal types and type members.

Is there any chance we could do something simpler just to unblock desert/serialization stuff? (I apologize for sounding like an impatient goon in advance...)

I don't think it's that difficult to implement but my OSS involvement hasn't been great in the summer months 😄 - as the season closes to an end I will probably find more time again. If I give it a go it would be the configurable version. But that doesn't stop somebody else from implementing the simpler version. It could then become a default if we make it configurable later.

joroKr21 avatar Aug 24 '21 17:08 joroKr21

Whoa, that actually works 😲 :

  object Foo {
    final val minFields = 0
  }
  val v: 0 = Foo.minFields

Pardon my ignorance...

deusaquilus avatar Aug 24 '21 21:08 deusaquilus

I played a bit with the Config concept, but have problem with obtaining a user defined implicit config, something like this:

  implicit val config: Config = new Config {
    override type Ignore = transient
  }

  implicit def deriveCsv[A]: Csv[A] = macro Magnolia.genWith[A]

and detecting the implicit

  def genWith[T: c.WeakTypeTag](c: whitebox.Context): c.Tree = {
    import c.universe._

    val config = c.inferImplicitValue(typeOf[Config]) match {
      case Literal(Constant(configValue: Config)) =>
        println(s"Found config: $configValue")
        configValue

      case other =>
        println(s"Found other: $other")
        Config.defaultConfig
    }

    Magnolia.gen(c, config)
  }

and this always produces

Found other: <empty>

Any idea why?

lukaszlenart avatar Oct 08 '21 09:10 lukaszlenart

Oh you want to have it implicit? In my imagination it would just be an argument to the macro like Magnolia.genWith[A, MyConfig]

joroKr21 avatar Oct 08 '21 11:10 joroKr21

I thought it will be easier ... looks like not :\ Now I'm trying to use the argument approach and it almost works, but just getting some implicit errors :\

lukaszlenart avatar Oct 08 '21 14:10 lukaszlenart

Still WIP but looks promising https://github.com/softwaremill/magnolia/pull/353

lukaszlenart avatar Oct 20 '21 13:10 lukaszlenart