core :duct/profile merge order is unintuitive

:duct/profile merge order is unintuitive

Open RickMoynihan opened this issue 4 years ago • 11 comments

The merge order of :duct/profile's does not work as expected.

Firstly one might reasonably expect that the merge order of profiles is determined by the order of profiles listed in the profiles argument to functions such as duct.core/exec-config, e.g. a user might expect the call:

(duct.core/exec-config config [:my.profile/b-one :my.profile/a-two])

to have the profile :my-profile/a-two meta-merge over the profile :my-profile/b-two. However the above profile will actually always be merged in the order of [:my-profile/a-two :my-profile/b-one] i.e. alphabetically on key name, due to the fallback comparator inside integrants key-comparator.

As duct profiles are almost modules one might try to force an ordering across them by establishing a dependency chain e.g:

  (require '[duct.core :as duct])

  (derive :example.profile/b-one :duct/profile)
  (derive :example.profile/a-two :duct/profile)

  (derive :my.app/requires :duct/const)

  (duct/prep-config {:example.profile/b-one {:a {:replace-me :init}
                                             :b [:b-one]}

                     :example.profile/a-two {:a {:replace-me :replaced}
                                             :b [:b-two]
                                             :c [:c-two]
                                             :my.app/requires (ig/ref :example.profile/b-one)}}
                    #{:example.profile/b-one :example.profile/a-two})

However, this doesn't work because any key deriving from :duct/profile has any #ig/refs they contain deactivated and converted into InertRefs here. Which means the subsequent call to fold-modules won't apply the profiles in the topological dependency order.

I should state that even if it worked I find the later solution significantly more confusing to reason about than an explicit ordering based on the order of the given profile keys. In particular in the case where you have multiple profile chains (with some shared and potentially optional profiles at various points in the chain e.g.

in dev merge in this order [:duct.profile/base :project.profile/customer :duct.profile/dev :project.profile/customer-dev :project.profile/local]
in test merge in this order [:duct.profile/base :project.profile/customer :duct.profile/test]
in prod merge in this order [:duct.profile/base :project.profile/customer :project.profile/customer-prod :project.profile/local]

This is really easy to reason about if each of the profile chains are specified like so; however when they're declared through dependencies they become a graph, and the intent is somewhat lost. Also I tend to think of profiles as maps that get merged with a well defined precedence to form a complete system; and don't think profiles depend on each other; they're just merged into a complete artifact.

On slack @weavejester mentioned there were 3 designs for modules

The current design, where a profile is a type of module. A design where modules and profiles are separate things, but at the same layer, or a design where modules are inside profiles, and normal configuration is inside modules.

I think the implementation is currently actually different from the intended design here, and is much more like the second suggested design. i.e. profiles and modules exist as separate things but at the same layer, e.g. (isa? :duct/profile :duct/module) ;; => false.

I'd certainly vote for a move towards profiles containing modules and normal configuration; with them being merged together prior to the system initiation in a well defined explicit and provided order. The risk with this move is that it might break peoples applications if they have defined custom profiles; however those users will currently be getting profiles merged in an alphabetical order. So perhaps we could effectively deprecate the current duct.core/exec-config but leave that merging profiles in alphabetical order, and create a new function duct.core/execute-config that applies profiles in the specified order?

An additional complication is what the semantics of keyword inheritance for profiles should be? e.g. we could possibly specify things like this:

(derive :duct.profile/prod-db :duct.profile/prod)
(derive :duct.profile/prod-routes :duct.profile/prod)

However when merging the profiles/refset for :duct.profile/prod the above the order will have to fallback to alphabetical between those profiles. Personally I think this should be strongly discouraged; as accidental orderings can easily arise, and that profiles should only use explicit keys.

Nov 06 '19 18:11 RickMoynihan

I think the implementation is currently actually different from the intended design here, and is much more like the second suggested design. i.e. profiles and modules exist as separate things but at the same layer, e.g. (isa? :duct/profile :duct/module) ;; => false.

Profiles are a type of module, and therefore probably should inherit from :duct/module for consistenty. The only reason they don't is that I wanted a way of running them before other modules, which necessitated giving them a separate hierarchy.

I'd certainly vote for a move towards profiles containing modules and normal configuration

The difficulty here is that modules and components also need to be separated, as their references are incompatible. So if we want to separate profiles from modules, we end up with a configuration three layers deep:

{:duct.profile/base
 {:duct.module/logging {}
  :duct.module/sql {}
  :duct.module/merge
  {:custom.service/echo
   {:log #ig/ref :duct/logger
    :db  #ig/ref :duct.database/sql}}}}

However, we could implement modules as metadata on the configuration:

{:duct.profile/base
 ^:duct.module/logging
 ^:duct.module/sql
 {:custom.service/echo
  {:log #ig/ref :duct/logger
   :db  #ig/ref :duct.database/sql}}}

There are a few advantages to this approach:

Profiles are more straightforward.
Profiles can contain modules.
Using metadata for modules visually looks like Java's annotations, or Python's decorators. Possibly therefore more intuitive?

The disadvantage is obviously that we'd break backward compatibility.

Nov 06 '19 19:11 weavejester

I'm not sure I follow all the reasoning of your proposal.

Firstly:

Profiles are a type of module, and therefore probably should inherit from :duct/module for consistenty. The only reason they don’t is that I wanted a way of running them before other modules, which necessitated giving them a separate hierarchy.

Yes I understand that, though this feels like a hack to me; as I don’t think profiles are like modules at all. I could be wrong on this, but it seems like you’re making them modules primarily for the perceived benefit of having as few core concepts as possible in duct. I definitely appreciate the desire to do this (we all love Scheme right? 😄), but I don’t think it pans out in practice. It's like you make them the same as modules only to then artificially force them to be handled differently (i.e. their evaluation happens at a different time, and profiles deactivate all their refs. Additionally I find the idea of managing profile merge order through integrants topoligical sort to be confusing.

I think I'd be more inclined to think of profiles as modules if duct itself didn't need to know about them, i.e. I think their presence in duct code like profile-keys, matches-profile? and their appearance as arguments to exec-config etc indicates that they're already a special type of concept, independent of modules. So I wonder if you want profiles and modules to both truly be the same then I think we'd want to look at designs that removed this knowledge from duct itself?

This said I don't in principle have a problem with duct having two concepts profiles and modules :-)

The difficulty here is that modules and components also need to be separated, as their references are incompatible. So if we want to separate profiles from modules, we end up with a configuration three layers deep:

I understand the references between modules and (system) components are incompatible; however I don't think you need to reflect the differences between profiles and modules in the config... i.e. can't profiles just be different from modules, handled specially by duct (they are already). Essentially have duct/exec-config find and initialise as integrant components all profiles via (ig/find-derived meta-config :duct/profile). Then meta-merge all the profiles together in the provided profiles seq order; before folding the modules to form the final system to init. Profiles would no longer be modules; they'd effectively just be implemented as derive :duct/profile :duct/const). Does this not work? I think the benefit of this approach is that the config could stay exactly the same... Though we may need a new var for duct/exec-config to account for the potential change in profile merge order.

I should also state I don't quite understand what :duct.module/merge is? Presumably every profile would need to declare their components nested in one??

However, we could implement modules as metadata on the configuration:

 {:duct.profile/base
  ^:duct.module/logging
 ^:duct.module/sql
 {:custom.service/echo {,,, ,,,}

I'm really not a fan of this approach, as it hides modules and makes them more magic. I think also nested metadata definitions may get confusing e.g. having a ^:displace inside module; or worse how would you ^:displace a module, given that ^:displace ^{:my.module/config {:module :config}} {:the/profile :here } actually puts the displace on the profile metadata. I don't know whether anyone would ever want to do this; but I think it highlights that under this design modules are then no "just integrant components".

Nov 07 '19 11:11 RickMoynihan

Yes I understand that, though this feels like a hack to me; as I don’t think profiles are like modules at all. I could be wrong on this, but it seems like you’re making them modules primarily for the perceived benefit of having as few core concepts as possible in duct.

It's partially that, but it's also partially that profiles are modules by definition. A module is a pure function that transforms the configuration. A profile is also a pure function that transforms the configuration, just in a more specialised way.

I understand the references between modules and (system) components are incompatible; however I don't think you need to reflect the differences between profiles and modules in the config... i.e. can't profiles just be different from modules, handled specially by duct (they are already).

I want to avoid placing two incompatible things in the same map, and ideally I'd like to ensure that modules can be specified per profile. One of the weaknesses of the current design is that you can't have per-profile module configurations.

I should also state I don't quite understand what :duct.module/merge is? Presumably every profile would need to declare their components nested in one??

It's a module that merges its content into the configuration. In other words, it's what profiles are now. If we're going to be going down the route of profiles → modules → components, then we need a module that merges in components into the configuration.

I'm really not a fan of this approach, as it hides modules and makes them more magic.

How does it hide them? They're visible in the edn configuration. I guess it hides them if you run (read-config), but why would you view the configuration via that function instead of looking at the edn file?

I think also nested metadata definitions may get confusing e.g. having a ^:displace inside module or worse how would you ^:displace a module, given that ^:displace ^{:my.module/config {:module :config}} {:the/profile :here } actually puts the displace on the profile metadata.

You can add metadata to a module's configuration. I don't think this hugely impacts readability, and in any case, it shouldn't be common use.

{:duct.profile/base
 ^{:my.module/foo ^:displace {:bar 1}}
 {:custom.service/baz {}}

I'm not sure why you'd want to displace or replace an entire map of modules. It sounds like something we'd want to prevent in general, as it leads to uninitutive results.

My inclination is that profiles should also not have top-level merge metadata, as that could implicitly change the merge order of profiles.

Nov 07 '19 14:11 weavejester

Metadata pros

Profiles are separate from modules, and can be ordered explicitly rather than topologically
Profiles can have different modules, and different module configurations
Briefer syntax (^:duct.module/logging) for modules that have no configuration
Looks similar to decorators/annotations, which can perform a similar role

Metadata cons

Metadata is invisible when (pprint config), unless we add metadata to the printer
Can't add merge metadata to whole profile, or whole module (though can add it inside)
Metadata inside metadata potentially confusing?
Metadata more "magic" looking?
If profiles are not modules, they can no longer implicitly add keys (e.g. :duct.core/environment)

Nov 22 '19 15:11 weavejester

I’m still not 100% sure I’m clear on the semantics of your proposal… How does it let one specify the priority ordering of profiles?

I think the main thing for me is that the semantics of profiles should be an explicit and very clear (and customisable) precedence order. I think also that an expansion from profiles → modules → components, makes sense… Though I’d like to clarify that I think profiles must also be able to contain components too.

I strongly feel those profiles should be merged in a specifiable order and explicit order. And there is non clearer to me than providing: [:profile/a :profile/b].

Whilst I’m less concerned at this stage with the syntax used and am more interested in ensuring profiles can merge in an explicit order, the proposed metadata syntax will I think be very, very confusing.

For example using an ataraxy module:

{:duct.profile/base 
^{:duct.module/ataraxy {"/" ^:example [:index] ,,,,}
 {:app.handler/index {,,,}}
:duct.profile/dev
^{:duct.module/ataraxy {"/dev" [:dev] ,,,,}
^{:more.modules/here {}}
^:another.module/here 
 {:app.handler/dev {,,,}} 
}

So now we’re doing a metamerge over meta-data, which will be more awkward to inspect and won’t pprint by default.

Also a more minor objection but more metadata will make it more awkward to get the syntax right; as it interferes with the natural :k :v :k :v rhythm of map syntax.

It also moves further away from the integrant foundations… and makes duct more bespoke syntactically than integrant.

Nov 22 '19 18:11 RickMoynihan

Thanks for getting back to me about this. Your feedback is really useful, and though I'm arguing for the metadata syntax, I'm currently undecided.

I’m still not 100% sure I’m clear on the semantics of your proposal… How does it let one specify the priority ordering of profiles?

The prep-config and exec-config would respect the ordering of the profiles you supply. So it would behave as you originally expected it to behave.

Whilst I’m less concerned at this stage with the syntax used and am more interested in ensuring profiles can merge in an explicit order, the proposed metadata syntax will I think be very, very confusing.

For example using an ataraxy module

I'd like to get away from modules that have significant amounts of configuration, for reasons I'll explain. The Araraxy module has been already removed from the Duct project template in favour of the Ataraxy router, as the introduction of prep-key in Integrant means that much of what the module does can be handled by a component.

Modules currently have their own options, but they can also read the configuration and make decisions based on the information they receive there. Part of the difficulty I've had with modules is figuring out what information to put in a module's options, and what information to store in the configuration instead.

For example, you can supply the database URL to the Duct SQL module directly:

{:duct.module/sql   {:database-url "postgres://localhost/test"}
 :duct.profile/base {}}

Or you can specify it by overriding the :duct.database/sql component:

{:duct.module/sql {}
 :duct.profile/base
 {:duct.database/sql {:connection-uri "postgres://localhost/test"}}}

My inclination is to favour the latter example. It's not significantly more verbose, and it makes explicit what the module is doing behind the scenes.

As I've developed Duct, my experience is that modules rarely need much, or indeed any configuration. It usually makes more sense for them to look at the current configuration to determine what to do.

So the fact that metadata makes it hard to make complex configurations doesn't seem like a significant disadvantage to me, as Duct has been moving away from that for a while. That said, I would be interested to hear about any modules that people have build that do require significant configuration.

It also moves further away from the integrant foundations… and makes duct more bespoke syntactically than integrant.

Let's take a quick look at the different options for a default configuration. This is the configuration currently:

{:duct.module/logging {}
 :duct.module/cljs {:main foo.client}
 :duct.module.web/site {}
 :duct.module/sql {}

 :duct.profile/base
 {:duct.core/project-ns foo

  :duct.router/ataraxy
  {:routes {[:get "/example"] [:foo.handler/example]}}

  :foo.handler/example
  {:db #ig/ref :duct.database/sql}}

 :duct.profile/dev   #duct/include "dev"
 :duct.profile/local #duct/include "local"
 :duct.profile/prod  {}}

This is how it would look using metadata:

{:duct.profile/base
 ^:duct.module/logging
 ^{:duct.module/cljs {:main foo.client}}
 ^:duct.module.web/site
 ^:duct.module/sql
 {:duct.core/project-ns foo

  :duct.router/ataraxy
  {:routes {[:get "/example"] [:foo.handler/example]}}

  :foo.handler/example
  {:db #ig/ref :duct.database/sql}}

 :duct.profile/dev   #duct/include "dev"
 :duct.profile/local #duct/include "local"
 :duct.profile/prod  {}}

Realistically, I'd probably add in a default for :duct.module/cljs based on the project namespace, so that module would lose its options as well:

{:duct.profile/base
 ^:duct.module/logging
 ^:duct.module/cljs
 ^:duct.module.web/site
 ^:duct.module/sql
 {:duct.core/project-ns foo

  :duct.router/ataraxy
  {:routes {[:get "/example"] [:foo.handler/example]}}

  :foo.handler/example
  {:db #ig/ref :duct.database/sql}}

 :duct.profile/dev   #duct/include "dev"
 :duct.profile/local #duct/include "local"
 :duct.profile/prod  {}}

Finally, this is how it would look like with a nested configuration:

{:duct.profile/base
 {:duct.module/logging {}
  :duct.module/cljs {:main foo.client}
  :duct.module.web/site {}
  :duct.module/sql {}

  :duct.module/base
  {:duct.core/project-ns foo

   :duct.router/ataraxy
   {:routes {[:get "/example"] [:foo.handler/example]}}

   :foo.handler/example
   {:db #ig/ref :duct.database/sql}}}

 :duct.profile/dev   #duct/include "dev"
 :duct.profile/local #duct/include "local"
 :duct.profile/prod  {}}

My thought is that the duct.module/base module would always be the first to be applied, and would just copy its option map into the configuration. So literally:

;; ensure all other modules come after :duct.module/base
(defmethod ig/prep-key :duct/module [_ options]
  (assoc options ::depends (ig/ref :duct.module/base)))

(defmethod ig/init-key :duct.module/base [_ config]
  (constantly config))

Which configuration style do you think seems the most accessible?

Nov 22 '19 21:11 weavejester

The prep-config and exec-config would respect the ordering of the profiles you supply. So it would behave as you originally expected it to behave.

Great 👍 👍

I'd like to get away from modules that have significant amounts of configuration

👍 Generally I prefer this approach; we moved to the router syntax in ataraxy a long time back; because of the limitations of the easier syntax.

As I've developed Duct, my experience is that modules rarely need much, or indeed any configuration. It usually makes more sense for them to look at the current configuration to determine what to do.

Ok that's interesting to hear. Whilst I agree with you that we rarely put any config in modules, I think I also have a use-case coming up for a module in our app (not as a library) which will need a relatively significant piece of config. It's really a data-macro scenario, where our app has multiple customer configs in it, and customers may have wildly different navigation, with different routes etc. I'd like to essentially move away from having to configure handlers etc, and move to configuring features in the navigation tree. I mention this only to say I think modules, as much as I'm wary of them, are also a very useful feature to support this kind of thing.

Anyway it's interesting that modules will have less config associated with them, that certainly counters some of my critique. Though I still think it's weird how much power will be hidden not just behind implicit mechanisms like prep-key but also then behind components hidden in metadata.

I definitely prefer the more explicit non meta-data nested design. One question, is :duct.module/base required to be present in every profile? i.e. is that the module that holds normal components within a profile, to preserve the separation of component types, or can you just put them in a profile too?

Thanks again for everything.

Nov 23 '19 00:11 RickMoynihan

One question, is :duct.module/base required to be present in every profile? i.e. is that the module that holds normal components within a profile, to preserve the separation of component types

Yes, that would be a requirement. I tried having modules and components in the same configuration, and it lead to too many issues, particularly around resolving references. I definitely want to keep them separate.

An alternative syntax is to separate them by key:

{:duct.profile/base
 {:modules
  {:duct.module/logging {}
   :duct.module/cljs {:main foo.client}
   :duct.module.web/site {}
   :duct.module/sql {}}

  :components
  {:duct.core/project-ns foo

   :duct.router/ataraxy
   {:routes {[:get "/example"] [:foo.handler/example]}}

   :foo.handler/example
   {:db #ig/ref :duct.database/sql}}}

 :duct.profile/dev   #duct/include "dev"
 :duct.profile/local #duct/include "local"
 :duct.profile/prod  {}}

Nov 23 '19 00:11 weavejester

Though I still think it's weird how much power will be hidden not just behind implicit mechanisms like prep-key but also then behind components hidden in metadata.

This is certainly a concern of mine. I think there's a balancing act between making configuration explicit but manually written, and implicit but automatically generated.

One thing I'm curious about; why do you say the metadata is "hidden"? Surely it's only hidden if you're looking at the config in your REPL? Does that happen often?

Nov 23 '19 00:11 weavejester

I think I too would rather use the non-meta data structure for modules. I'm neither familiar with Java's annotations nor Python's decorators so I can't relate to that point you made. As for the alternative non-meta syntax you proposed - I understand that, under the hood, it would still use the :duct.module/base, wouldn't it? If that's the case then I feel the benefit of separating modules and components by key.

If we see backwards compatibility much of an issue then maybe we could introduce a versioning key in config.edn? Something like what docker-compose.yml does (docs).

Also since we are talking about modules and components, I'd like to point out that the issue #29 potentially relates to this.

Nov 25 '19 16:11 werenall

One thing I'm curious about; why do you say the metadata is "hidden"? Surely it's only hidden if you're looking at the config in your REPL? Does that happen often?

Yes, I do pprint the config quite a bit. We have a complex multi-file configuration, with various top-level includes and profiles. So whilst you can guess what the result will be easily enough, when there are problems to debug it's really handy to just pprint the integrant.repl.state/config to check its contents are what you expect.

We've drifted somewhat from the standard duct project template, because we run a multi-tennant configuration; where each customer has their own set of profiles that get resolved by some wrappers over the duct functions, e.g. in dev we'd do (reset :customer-name) to change to that customers dev site. A meta-config function in our apps.main namespace essentially then assembles the top level duct config from the :customer-name key; so we're probably somewhat atypical of other duct users.

Dec 03 '19 16:12 RickMoynihan

core core copied to clipboard

:duct/profile merge order is unintuitive

Metadata pros

Metadata cons

core
core copied to clipboard