Drasil `State`ful artifact rendering: Verify it makes sense to implement (and implications), design, and implement

(TODO: @balacij will return to this ticket on ~~Mon. July 22nd~~ Tues. July 23rd, 2024 and write an actual description).

Unblocks (related 'problem' symptoms):

#806
#946
#1235
#1415 (related to #946)
#3796

Jul 21 '24 18:07 balacij

NOTE: I'm going to write a few different comments discussing different issues/symptoms.

I'll start with a code snippet:

https://github.com/JacquesCarette/Drasil/blob/5705da6c270bd1839801037c71a5e8466072e655/code/drasil-docLang/lib/Drasil/DocumentLanguage.hs#L193-L211

It seems harmless, right? The answer should be "yes" because that's one of the key "transformation/render" points where we generate the SRS. The issue is that we abuse it:

λ ~/Programming/Drasil/code/ noInterMapDupe rg "mkSections" -ths
drasil-docLang/lib/Drasil/DocumentLanguage.hs
71:  mkSections fullSI l where
96:    allSections = concatMap findAllSec $ mkSections si $ mkDocDesc si dd -- FIXME: `mkSections` on something particularly large that is immediately discarded is a sign that we're doing something wrong. That's in addition to `mkDocDesc`...
128:    allSections = mkSections si $ mkDocDesc si dd

We have 3 references to it!

3 References to `mkSections`

(1) Generating the Abstract SRS Documents

https://github.com/JacquesCarette/Drasil/blob/5705da6c270bd1839801037c71a5e8466072e655/code/drasil-docLang/lib/Drasil/DocumentLanguage.hs#L66-L73

(2) Finding all "Sections and LabelledContent" for inputting into the `ChunkDB`

https://github.com/JacquesCarette/Drasil/blob/5705da6c270bd1839801037c71a5e8466072e655/code/drasil-docLang/lib/Drasil/DocumentLanguage.hs#L89-L118

(3) Finding all "References" for inputting into the `ChunkDB`

https://github.com/JacquesCarette/Drasil/blob/5705da6c270bd1839801037c71a5e8466072e655/code/drasil-docLang/lib/Drasil/DocumentLanguage.hs#L120-L149

Symptom A

Immediately relevant issues/PRs:

#4022
#4023

As I wrote in the second code snippet:

FIXME: `mkSections` on something particularly large that is immediately discarded is a sign that we're doing something wrong. That's in addition to `mkDocDesc`...

We are effectively generating an abstract copy of the entire SRS document 3 times in the code, with all 3 used in code generation, in different ways. (2) and (3) are 'assistive' for (1) -- they search for references to all Sections, LabelledContent, and References and insert them into the ChunkDB. When (1) is finally rendered into a concrete SRS artifact (i.e., HTML, Jupyter, mdBook, $\LaTeX$), UID references are all resolved, which is why (2) and (3) are helpful -- they automate insertion of the References and LabelledContent, and whatever else, into the ChunkDB for (1) to resolve content for.
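To make the overlap between (2) and (3) concrete, here is a minimal sketch of a single traversal that gathers labels and citation keys together from a document that is built once. It assumes a toy document type: `Inline`, `Block`, `Found`, `scan`, and the keys `refA`/`refB` are all hypothetical stand-ins, not Drasil's real types or data, and the sketch does not address how the result would be fed back into the `ChunkDB`.

```haskell
-- Hypothetical stand-ins; none of these are Drasil's real types.
data Inline = Text String | RefTo String | Cite String

data Block
  = Section String [Block]      -- label and child blocks
  | Labelled String [Inline]    -- labelled content, e.g. a figure caption
  | Para [Inline]

-- Everything the two ChunkDB-filling passes look for, gathered together.
data Found = Found
  { foundLabels :: [String]     -- Section and LabelledContent labels
  , foundCites  :: [String]     -- citation keys
  } deriving (Eq, Show)

instance Semigroup Found where
  Found a b <> Found c d = Found (a <> c) (b <> d)

instance Monoid Found where
  mempty = Found [] []

-- One traversal that collects labels and citations at the same time.
scan :: Block -> Found
scan (Section l bs)    = Found [l] [] <> foldMap scan bs
scan (Labelled l inls) = Found [l] [] <> foldMap scanInline inls
scan (Para inls)       = foldMap scanInline inls

scanInline :: Inline -> Found
scanInline (Cite k) = Found [] [k]
scanInline _        = mempty

-- A tiny example document, built once and scanned once.
example :: [Block]
example =
  [ Section "sec:intro"
      [ Para [Text "See ", Cite "refA", Text " and ", RefTo "fig:sys"]
      , Labelled "fig:sys" [Text "System context ", Cite "refB"]
      ]
  ]
```

Here `foldMap scan example` yields `Found ["sec:intro","fig:sys"] ["refA","refB"]`, so one pass over one copy of the document supplies what (2) and (3) each currently rebuild the document to find.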

Apr 10 '25 18:04 balacij

Symptom (B)

Now let's look at: #1235 and #1661. This issue is also related to why we have usedinfodb (also discussed in #3260 and more in https://github.com/JacquesCarette/Drasil/issues/1661#issuecomment-1021450950) -- a hack solution that requires manual collection of the terms with acronyms presented in the generated SRS documents.

How could we fix this today? Similar to the issues related to Symptom (A), we could generate the entire abstract document with mkSections and figure out which acronyms are actually referenced.

The issue with this is that it would exacerbate Symptom (A) and potentially waste computation if we do it in the same way as the earlier referenced code.

Apr 10 '25 18:04 balacij

Symptom (C)

Now let's look at: https://github.com/JacquesCarette/Drasil/issues/806. (Actually, in this ticket, I jump to the solution presented in the title, but I forgot about that until now....)

We want our generated bibliography to be ordered by order of presentation. Again, if we wanted to fix this today, we could take the same approach: generate the abstract document with `mkSections` and scan it.

https://github.com/JacquesCarette/Drasil/issues/1415 and https://github.com/JacquesCarette/Drasil/issues/946 are very similar. The way to fix them today is to scan our mkSections and use that information as necessary.
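As a rough illustration of what a stateful renderer could do instead of a separate scan, the sketch below threads the list of citation keys seen so far through rendering, so the bibliography order falls out of the same pass that produces the text. It uses `mapAccumL` from `base` in place of a `State` monad; `Inline`, `renderInline`, `renderPara`, and the keys are hypothetical and not part of Drasil.

```haskell
import Data.List (elemIndex, mapAccumL)

-- Hypothetical inline content; not Drasil's real representation.
data Inline = Text String | Cite String

-- Render one inline while threading the citation keys seen so far.
-- The state is the list of keys in order of first appearance.
renderInline :: [String] -> Inline -> ([String], String)
renderInline seen (Text s) = (seen, s)
renderInline seen (Cite k) =
  case elemIndex k seen of
    Just i  -> (seen, "[" ++ show (i + 1) ++ "]")
    Nothing -> (seen ++ [k], "[" ++ show (length seen + 1) ++ "]")

-- Render a paragraph; the first component is the bibliography order.
renderPara :: [Inline] -> ([String], String)
renderPara inls = concat <$> mapAccumL renderInline [] inls

example :: [Inline]
example =
  [ Text "As in ", Cite "refB", Text " and ", Cite "refA"
  , Text "; again ", Cite "refB" ]
```

With this input, `renderPara example` returns `(["refB","refA"], "As in [1] and [2]; again [1]")`: the keys come back in order of first citation, without building the document a second time.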

Apr 10 '25 18:04 balacij

Problem

The real problem is that the content of our documents has scopes, and said content wants to be aware of its scope. For example, the bibliography is "top-level/meta"-content/information that wants to be aware of everything else. The document outline and global tables of $X$ are similar.

Local tables of $X$ are also similar, but within smaller scopes. For example, when rendering our IM/TM/GD/DD boxes, when we present the list of variables, they want to be aware of the variables in scope of the equations presented within the box and to present only the ones immediately used in said box.
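A small sketch of that local case, assuming a toy expression type: the table for a box is derived from the symbols its equation actually uses, rather than from a manually maintained list. `Expr`, `Box`, `symbolsIn`, `localTable`, and the symbol descriptions are hypothetical; Drasil's real expression language is much richer.

```haskell
import Data.List (nub)

-- Hypothetical expression type, for illustration only.
data Expr
  = Sym String
  | Lit Double
  | Add Expr Expr
  | Mul Expr Expr

-- Symbols used by an expression, in order of first use.
symbolsIn :: Expr -> [String]
symbolsIn = nub . go
  where
    go (Sym s)   = [s]
    go (Lit _)   = []
    go (Add a b) = go a ++ go b
    go (Mul a b) = go a ++ go b

-- A box (e.g. a data definition) with its defining equation.
data Box = Box { boxName :: String, boxEqn :: Expr }

-- The local table lists only the symbols in scope of this box's equation,
-- with descriptions looked up in a global symbol table.
localTable :: [(String, String)] -> Box -> [(String, String)]
localTable globals box =
  [ (s, d) | s <- symbolsIn (boxEqn box), Just d <- [lookup s globals] ]

globals :: [(String, String)]
globals =
  [ ("F", "force"), ("m", "mass"), ("a", "acceleration")
  , ("g", "gravitational acceleration") ]

weight :: Box
weight = Box "weight" (Mul (Sym "m") (Sym "g"))
```

Here `localTable globals weight` is `[("m","mass"),("g","gravitational acceleration")]`: only the symbols used inside the box appear, even though the global table knows about more.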

Our current strategy for dealing with this relies on some hacks:

Scanning mkSections and discarding results.
Conflating display knowledge with math knowledge and assuming that it will be displayed -> https://github.com/JacquesCarette/Drasil/blob/5705da6c270bd1839801037c71a5e8466072e655/code/drasil-theory/lib/Theory/Drasil/DataDefinition.hs#L101-L104

Apr 10 '25 18:04 balacij

My head's starting to spin with this issue because of its sheer scale. There are a bunch of issues that are coming down to the same problem, and now I'm having a hard time analyzing everything together, coming up with a solution, and verbalizing it coherently.

There's also https://github.com/JacquesCarette/Drasil/discussions/3796#discussioncomment-9824376, which is something of a "Symptom (D)" and where I built a prototype solution that we could extend.

I'm spending too much time on this, so I'm going to stop myself. @smiths @JacquesCarette if either of you has any thoughts on this, that would be appreciated. Otherwise, I will put this on the back burner for a little bit and gather my thoughts slowly.

Apr 10 '25 19:04 balacij

> The real problem is that the content of our documents has scopes, and said content wants to be aware of its scope.

I think this is a valuable insight. Luckily, phrasing it that way helps, because it leads to standard solutions from PL.

We need to rethink

What the 'assembly' process for a document should be
What we think 'document pieces' should be

Maybe we should strongly leverage programming language ideas: hierarchical name spaces, context polymorphism, static and dynamic scopes (and scope resolution), etc.
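As one small, hedged illustration of what "hierarchical name spaces" and scope resolution might look like for document pieces, the sketch below resolves a name innermost-first and falls back to enclosing scopes. `Scope`, `resolve`, and the example bindings are hypothetical; this is plain static scoping, not a proposal for the actual design.

```haskell
-- A scope holds its own bindings and, optionally, an enclosing scope.
data Scope = Scope
  { bindings :: [(String, String)]
  , parent   :: Maybe Scope
  }

-- Resolve a name innermost-first, falling back to enclosing scopes.
resolve :: String -> Scope -> Maybe String
resolve name (Scope bs p) =
  case lookup name bs of
    Just v  -> Just v
    Nothing -> p >>= resolve name

-- Document-level scope: visible everywhere.
docScope :: Scope
docScope = Scope [("g", "gravitational acceleration"), ("m", "mass")] Nothing

-- A box-level scope that shadows "g" locally.
boxScope :: Scope
boxScope = Scope [("g", "gain")] (Just docScope)
```

In this example `resolve "g" boxScope` gives `Just "gain"` (the local binding shadows the document-level one), `resolve "m" boxScope` falls through to `Just "mass"`, and an unknown name resolves to `Nothing`.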

We could also seek inspiration from XML/XSLT (and HTML/CSS). And maybe from LaTeX and/or typst and/or org-roam.

In other words: what is a good programming language for

declaring re-usable parts of documents
assembling whole documents from parts

Note that I'm not assuming that this is the same language. I am assuming that both are DSLs.

Apr 16 '25 19:04 JacquesCarette
