nix Greatly expand architecture section, including splitting into abstract vs concrete model

I want to completely specify the store layer.

I expect the concrete vs abstract split to be controversial, so here is my rationale for it:

Explaining everything at once makes for a huge tsunami of information that is highly likely to overwhelm the reader, even if they are "random accessing" the reference rather than reading it in order, end to end. However, if one learns the abstract model first, one can develop a mental model of the "skeleton" on which the concrete details are the "meat". Rather than appearing as a wave of assorted details, the details make sense as a combination of the requirements of the abstract model with only a few arbitrary choices and historical evolutions mixed in. That makes for a "compressed" mental representation (the concrete model encoded against the abstract model), which is a lot less mentally taxing.

In the language of https://documentation.divio.com/'s documentation quadrant, the abstract model is pure explanation, describing what we are trying to do without the constraints of how Unix and software for Unix work today. The concrete model is mainly reference, with just enough explanation to tie it back to the abstract model.

This order of information I think matches how @edolstra designed Nix in the first place (see patterns in functional programming that can be reused for a new purpose, design the system accordingly), and also matches how I think most advanced users and developers of Nix think about it. I thus think it is both a proper and a "proven" strategy for how the "bottom half" of the documentation quadrant can be conveyed.

Aug 06 '22 00:08 Ericson2314

Both reviews should have probably been submitted to #6420, right?.. (It's highly confusing to have 2 PRs for the same topic - are there more?)

Aug 07 '22 23:08 toraritte

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/summer-of-nix-documentation-stream/20351/4

Aug 08 '22 00:08 nixos-discourse

Both reviews should have probably been submitted to https://github.com/NixOS/nix/pull/6420, right?.. (It's highly confusing to have 2 PRs for the same topic - are there more?)

Well, this one is supposed to pick up from the last one. Even before I did the abstract vs concrete thing, I made the separate branch to have more content at the cost of less polishing.

Aug 09 '22 03:08 Ericson2314

I'm not convinced that having a big architecture document is the way to go. I think it would be better to have more "literate", cross-referenced comments in the source code in combination with doxygen (or whatever). For example:

- reference-scanning.md should be a doxygen comment in references.hh.
- The description of the NAR format can go into archive.hh.
- Not sure what the best anchor would be for describing the store path hash computation, but probably we should move all makeStorePath() and friends out of store-api.cc into their own source file and then we can keep the docs there.
- The description of the build algorithm can go into libstore/build.

I also don't think we should have Haskell-style pseudo-code that reproduces C++ types, like this:

data DerivedPath
  = OpaquePath { path : StorePath }
  | BuiltPath {
      drv    : StorePath,
      output : OutputName,
    }

Given how much churn there has been in these types on the implementation side, having to constantly update an architecture document to reflect implementation changes would be painful. It is hard enough to keep comments in sync with the source, but the architecture document is likely to become outdated immediately. Again I think it would be preferable to have doxygen comments on the C++ types that describe their function.

There is also a lot of terminology here that doesn't appear in the implementation, which is likely to confuse people, e.g. "references are capabilities", FSOs, etc. Also, there are some reflections that would be more appropriate for an academic paper, like monadic vs applicative build systems and comparisons to Von Neumann/Harvard architectures.

Aug 12 '22 08:08 edolstra

As pure tooling advice, this is handy: https://github.com/JoelCourtney/mdbook-kroki-preprocessor

Aug 12 '22 13:08 blaggacao

@edolstra the entire point of having a specification document is that it is separate from the implementation. I want people to know how the Nix store layer works without having to read any C++ or even Doxygen.

Aug 12 '22 14:08 Ericson2314

Given how much churn there has been in these types on the implementation side, having to constantly update an architecture document to reflect implementation changes would be painful.

A lot less painful than doing it out of tree though!

Also, I think the type changes have mainly been replacing strings and using more std::variant, and thus are by and large converging on a new stable design. I expect new changes to come not from refactoring existing features once they are cleaned up, but from adding new features like "deep content-addressing".

It still benefits being in-tree in that case so we can update the spec and implementation together in the new feature case, but it is not so onerous that it is futile to have a spec at all.

Aug 12 '22 14:08 Ericson2314

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/summer-of-nix-documentation-stream/20351/2

Aug 15 '22 12:08 nixos-discourse

doxygen

@edolstra Do we have infrastructure to make this visible on a par with the manual? As long as we don't, I'm strongly in favor of adding this type of material to the manual. I'm agnostic to how we build this, if the result is easily discoverable and not too expensive in terms of setup and maintenance.

I agree that this kind of thing should be right next to the code, but we can still move it in there later. What's more important is that it's written down at all, and easy to find, read, and modify.

I also don't think we should have Haskell-style pseudo-code that reproduces C++ types

@edolstra Agreed. The architecture spec should show the principles, not the implementation details. Actually we should not use any kind of specific code to represent those principles, but ideally something more universal (and self-explanatory) such as UML diagrams (no specific proposal, just an example) where it's suitable. Even if Haskell or whatever is more precise, we cannot assume people to know any of this. Learning also has dependencies, and we should make them explicit and keep the closure small.

There is also a lot of terminology here that doesn't appear in the implementation

Concerning really abstract stuff that doesn't touch on what's really going on in the implementation I agree, but I suppose some things can have more structured naming. Sure, the term FSO is not used in the code, but why not change the code?

@Ericson2314 I'm not convinced by the abstract/concrete setup. There is overwhelming evidence that people don't learn well from first principles. The manual, even the architecture spec, should describe Nix specifically, and nothing else. As suggested above, we can squint a little bit or add some wishful thinking about code factoring and naming to make things appear less thorny, but we should stay with what there is.

While I believe all of what you wrote is true, it is often hard to read and understand. And because we made the mistake already in https://github.com/NixOS/nix/pull/6420, I think we should go through this with multiple PRs section by section (as outlined here), even if it is more work for authors. It's really really hard to review such a large amount of text, and we have to take reviewers' effort into consideration as well.

@edolstra If you could please revert https://github.com/NixOS/nix/commit/81e101345fda2a8651c470f08b364a1ca6fa37cf we can do exactly that for #6420 and make a PR for each section topologically sorted, now that we have developed an idea on the general direction we would like to take.

Aug 17 '22 15:08 fricklerhandwerk

The point of the data types is that it is the interface, not just a mere implementation detail.

I am fine replacing them with something that doesn't look like code, but let's be clear what this is achieving. People have a bias "programming language = implementation", so we are simply laundering the exact same information in a format that seems more "platonic".

If we understand that we're just trying to find a presentation that plays nicely with people's biases, great, we're on the same page. If we think there actually is an objective rather than rhetorical problem with data types in the spec, then we're still talking past each other on what a "specification" even means.
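To make the point concrete: the DerivedPath definition quoted earlier in this thread carries the same information whatever notation it is written in. Below is a rough sketch of it in Python, purely for illustration. Modelling StorePath and OutputName as plain strings is an assumption, and describe is a hypothetical helper; neither comes from Nix itself.

```python
from dataclasses import dataclass
from typing import Union

# Assumption: store paths and output names are modelled as plain strings.
StorePath = str
OutputName = str


@dataclass(frozen=True)
class OpaquePath:
    """A store path with no further structure attached."""
    path: StorePath


@dataclass(frozen=True)
class BuiltPath:
    """A named output of a derivation."""
    drv: StorePath
    output: OutputName


# The sum type from the thread: either variant is a DerivedPath.
DerivedPath = Union[OpaquePath, BuiltPath]


def describe(p: DerivedPath) -> str:
    """Hypothetical helper: consumers must handle both variants."""
    if isinstance(p, OpaquePath):
        return f"opaque path {p.path}"
    return f"output {p.output} of {p.drv}"
```

The two variants and their fields are exactly those of the quoted definition; a diagram would have to convey the same two cases and the same three fields.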

Aug 17 '22 15:08 Ericson2314

People have a bias "programming language = implementation", so we are simply laundering the exact same information in a format that seems more "platonic".

@Ericson2314 when we worked on #6420 we were pretty clear in that regard. We used non-code interface descriptions to be more universally approachable and simplified them to a point where the principle became visible. Sure, if we rewrote the Store in Haskell, that would be a moot point. But that's not what Nix is today.

It's not about dumbing down, it's about making it easy to understand with minimal prerequisites.

Aug 17 '22 17:08 fricklerhandwerk

@fricklerhandwerk but that PR did contain data definitions, like https://github.com/nix-community/nix/blob/39d32ac4c63f4aa3784d114b19c0eca83e306ca9/doc/manual/src/architecture/store/fso.md This PR adds more of them, but it uses them in exactly the same way: to specify Nix, irrespective of how it is implemented.

Aug 17 '22 21:08 Ericson2314

We can make up a UML with some types, but we cannot simply refuse to present this information without losing a ton of clarity. All software is best understood via its data model; there is simply no way around this.

Aug 17 '22 21:08 Ericson2314

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/a-proposal-for-replacing-the-nix-worker-protocol/20926/16

Aug 21 '22 16:08 nixos-discourse

This PR extends the manual by 22% in terms of words, and has been in limbo for 1.5 years. What can we deduce from that?

- Large PRs don't work. (Also note this is a draft.)
- Some 20% of the manual is missing and can't be linked to. This makes all other doc contributions worse.
- Either
  - John needs to work harder to split out chunks of this PR into smaller ones and push the team harder to review those.
  - We need a less perfectionist attitude so that it's easier to make progress on docs.

If I may oversimplify and extrapolate:

- Average pace is 2000 words per year since inception.
- Average pace in the last two years is 3250 words per year.
- This single PR adds 9000 words and would take 4.5 years at the since-inception pace of 2000 words per year, and that's assuming it's the only PR we care about, and it's not even complete yet.

If you allow me to be very hand-wavy, I would

- assume that this is an outlier because we mostly don't have architecture-oriented docs yet. Divide by 2
- assume that this is the biggest PR we need
- assume that we need 20-30 significant PRs
- assume that their size distribution is similar to letter frequency

Now my napkin says that the manual needs to double in size.

So if we don't speed up our process, the Nix manual won't be anywhere near completion for another 12 years, assuming no new features or behaviors.
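For anyone who wants to check the napkin, here is a minimal sketch of the arithmetic in Python. It assumes the 4.5-year figure uses the 2000 words per year pace and the 12-year figure uses the 3250 words per year pace; the current manual size is only inferred from the 22% figure, and all variable names are made up for illustration.

```python
# Figures taken from the comment above.
PR_WORDS = 9000        # words added by this PR
PR_SHARE = 0.22        # the PR extends the manual by 22%
PACE_ALL_TIME = 2000   # words per year since inception
PACE_RECENT = 3250     # words per year over the last two years

# Inferred: if 9000 words is 22% of the manual, this is its current size.
manual_words = PR_WORDS / PR_SHARE

# Time to write this PR's worth of text at each pace.
years_for_pr = PR_WORDS / PACE_ALL_TIME
years_for_pr_recent = PR_WORDS / PACE_RECENT

# "The manual needs to double in size" means roughly manual_words are missing.
years_to_double = manual_words / PACE_RECENT

print(f"implied manual size: {manual_words:.0f} words")
print(f"this PR at the all-time pace: {years_for_pr:.1f} years")
print(f"this PR at the recent pace: {years_for_pr_recent:.1f} years")
print(f"doubling at the recent pace: {years_to_double:.1f} years")
```

Under these assumptions the doubling estimate lands between 12 and 13 years, which is consistent with the figure quoted above.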

My suggestion would be to accept additions leniently (focusing on correctness and perhaps duplication; not much else), merge into master, and perform all other editing as a separate process that produces many small PRs that are easily reviewed. This way we break the perfectionism barrier, and everyone gets to enjoy the reward of making actual change. Documentation is largely a volunteer-flavored effort, so while enjoyment and reward may not be the absolute top priorities for docs, I believe they are crucial for keeping contributors and reviewers motivated and speeding up the process that way.

Apr 18 '24 17:04 roberth

I agree with the concern, and can only add my observations:

- progress is proportional to time spent
- we start way more things than we finish (which is in part due to attempts to address immediate needs and then falling into rabbit holes to fix underlying issues)
- pair and mob editing is a great tool to keep momentum and focus, and fulfills social needs that usually fall short

We had our peaks when we met more often to work together. I think making Nix documentation better is primarily an economic problem, because by now we know exactly what to do and how.

Apr 18 '24 17:04 fricklerhandwerk

That is also true. We currently have a process that leads to quality documentation, if incomplete at this velocity.

I do not know how these processes compare. I've only found some numbers that indicate very clearly we ought to do something.

Apr 18 '24 17:04 roberth