Create a Process/Data Flow Diagram Detailing a Complete Drasil Run

Open peter-michalski opened this issue 2 years ago • 29 comments

The process flow diagram should detail the stages/processes that occur when running Drasil.

Inputs, outputs, data, packages, and some functions may be noted in the diagram.

There will likely need to be several diagrams: an abstracted holistic diagram detailing the entire run, and several more diagrams for individual parts of the holistic diagram.

Developers will be the users of the diagrams.

peter-michalski avatar Apr 08 '22 18:04 peter-michalski

Sounds good @peter-michalski. We will likely also need to do some writing about the design, in addition to the diagram.

smiths avatar Apr 08 '22 19:04 smiths

The following are some WIP process flow diagrams for:

  1. Drasil build
  2. gen() for each example - part of main()
  3. genCode() for each example - part of main()
  4. genDot() for each example - part of main()
  5. genLog() for each example - part of main()

The diagrams give a high-level view. Feedback for improving them is appreciated. I will dig deeper into mkSRS() (called from gen()) and create a similar diagram.

Building-Drasil(1)

Drasil-Example-Main-gen(6)

Drasil-Example-Main-genCode(5)

Drasil-Example-Main-genDot(3)

Drasil-Example-Main-genLog(4)

fullSIsSI(2)

piSysPI

mkSRSDecl

peter-michalski avatar Apr 21 '22 16:04 peter-michalski

This looks like a good start @peter-michalski. To help with reading the figures, please add an explanation of the meaning of the arrows.

smiths avatar Apr 21 '22 23:04 smiths

They look nice! They all seem to approximately follow the same information flow too, which is great to see. Maybe we can create a generic version of them as well? I imagine we have at least 3 "stages" of the flow: input with a floating list of chunks/knowledge fragments, transformation with an ordered list of the notable transformers used in the transformation, and output with a floating list of produced chunks/knowledge fragments (perhaps with some overlap with the input as well, but that's okay).

It looks like the SystemInformation <- fullSI <- SystemInformation <- fillcdbSRS <- ... ends are also repeated, are we able to merge them into one picture on their own, and reference the created "SystemInformation"s instead? I really liked the dotted line across an arrow as well, maybe we can use that to show some "from" nodes are user-provided too?
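
As a rough illustration of the three-part flow suggested above (a floating list of inputs, an ordered list of transformers, a floating list of outputs), a minimal Haskell sketch might look like the following. The `Flow` type and its helpers are hypothetical and do not exist in Drasil; the example value only reuses the `SystemInformation <- fullSI <- SystemInformation` step quoted above.

```haskell
import Data.List (intercalate)

-- Hypothetical record for one generic flow; not part of Drasil.
data Flow = Flow
  { flowInputs       :: [String]  -- floating list: chunks/knowledge consumed
  , flowTransformers :: [String]  -- ordered list: notable transformers applied
  , flowOutputs      :: [String]  -- floating list: chunks/knowledge produced
  } deriving (Show, Eq)

-- Names that appear both as an input and as an output (the "overlap").
overlap :: Flow -> [String]
overlap f = filter (`elem` flowOutputs f) (flowInputs f)

-- Render a flow as three labelled lines of plain text.
render :: Flow -> String
render f = unlines
  [ "inputs:       " ++ intercalate ", " (flowInputs f)
  , "transformers: " ++ intercalate " -> " (flowTransformers f)
  , "outputs:      " ++ intercalate ", " (flowOutputs f)
  ]

-- Example built only from the chain quoted in this thread.
exampleFlow :: Flow
exampleFlow = Flow ["SystemInformation"] ["fullSI"] ["SystemInformation"]
```

Evaluating `putStr (render exampleFlow)` in GHCi prints the three lists, and `overlap exampleFlow` reports which names pass through unchanged in kind.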

balacij avatar Apr 26 '22 16:04 balacij

@balacij Thanks for the feedback, I have incorporated it into the diagrams.

Maybe we can create a generic version of them as well?

Could you please clarify this? Do you mean a 'more abstract' diagram? Maybe something similar to what is already in the Recipe Wiki (shown below)?

[Figure: diagram from the Recipe Wiki]

peter-michalski avatar Apr 26 '22 20:04 peter-michalski

Yes, a 'more abstract' diagram is what I meant, and that does look approximately like it too (but still a bit too specific and wide in scope)!

balacij avatar Apr 26 '22 21:04 balacij

Blank diagram (3)

@balacij I think this may be what you are looking for; showing chunks/knowledge relationships. Could you please clarify (and preferably give specific examples) what you mean by 'notable transformers'? Transformers are not mentioned much in the Drasil repo (just a few comments on mtl, lens, monad transformers, RuleTransformer)

peter-michalski avatar May 11 '22 18:05 peter-michalski

When I mentioned the "3 'stages'", I really meant "3 nodes", where the nodes are floating lists instead of actual diamonds/squares/shapes/etc. I hope I didn't cause any confusion with that. That diagram makes the chunk relationships really digestible, which looks very useful. Regarding the transformers, I mean the functions that we use to convert chunks into other useful chunks. For example, the large genSRS (generating an SRS from a ChunkDB) transformer relies on multiple smaller ones: converting TheoryModels into descriptive sections and tables, converting Exprs and ModelExprs into TeX through printing to a printing language first (so this one might be 2, https://github.com/JacquesCarette/Drasil/blob/adfc15f13c7cf81101c1dbab97a6aa61f10a4ec6/code/drasil-printers/lib/Language/Drasil/TeX/Print.hs#L99-L100 and https://github.com/JacquesCarette/Drasil/blob/adfc15f13c7cf81101c1dbab97a6aa61f10a4ec6/code/drasil-printers/lib/Language/Drasil/Printing/Import/Expr.hs#L111-L113), converting relevant chunks into a traceability matrix, etc.
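
To make the two-step idea concrete (expressions are first converted to a printing language, and only then rendered to TeX), here is a toy sketch. It assumes a drastically simplified expression type and printing language; the names `Expr`, `PExpr`, `toPrinting`, and `toTeX` are illustrative only and are not the definitions in the linked Drasil modules.

```haskell
-- Toy expression language (hypothetical; far smaller than Drasil's Expr).
data Expr
  = Lit Int
  | Var String
  | Add Expr Expr
  | Frac Expr Expr

-- Toy intermediate "printing language": layout only, no semantics.
data PExpr
  = PText String
  | PRow [PExpr]
  | PFrac PExpr PExpr

-- Step 1: expression -> printing language.
toPrinting :: Expr -> PExpr
toPrinting (Lit n)    = PText (show n)
toPrinting (Var v)    = PText v
toPrinting (Add a b)  = PRow [toPrinting a, PText "+", toPrinting b]
toPrinting (Frac a b) = PFrac (toPrinting a) (toPrinting b)

-- Step 2: printing language -> TeX source.
toTeX :: PExpr -> String
toTeX (PText s)   = s
toTeX (PRow ps)   = unwords (map toTeX ps)
toTeX (PFrac n d) = "\\frac{" ++ toTeX n ++ "}{" ++ toTeX d ++ "}"

-- The composite "transformer" is just the composition of the two steps.
exprToTeX :: Expr -> String
exprToTeX = toTeX . toPrinting
```

Splitting the work this way means a second back end (say, HTML) would only need a new step 2, which may be one reason to count this as two transformers rather than one.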

balacij avatar May 12 '22 14:05 balacij

[I'm going to leave a series of comments, as there really are a series of diagrams to comment on.]

Building Drasil:

  • there's a conflation of several things going on here. The "for each package" steps are not Drasil-specific, but are rather unraveling what stack does. I find this information confusing, as I barely know the exact details of stack, and I've never needed to know them.
  • the order given here is not really canonical, but is the current order based on a linearization (aka topological sort) of the package dependencies. So the important graph here is that of dependencies.
  • I don't quite understand what process removes log folders?
  • Shouldn't there be another dotted line between "examples built" and "check logs for differences"?
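
The point about linearization can be sketched in a few lines: given only the dependency graph, a valid build order falls out of a topological sort. This is a minimal sketch using `Data.Graph` from the `containers` package (shipped with GHC); the package names are placeholders, not Drasil's actual dependency data, and stack's real scheduling is not modelled.

```haskell
import Data.Graph (graphFromEdges, topSort)

-- Each entry is (package, its direct dependencies). Placeholder names only.
type Deps = [(String, [String])]

-- One valid build order: every package appears after its dependencies.
-- topSort puts a package before what it depends on, so the result is reversed.
buildOrder :: Deps -> [String]
buildOrder deps =
  reverse [ name | v <- topSort g, let (name, _, _) = fromVertex v ]
  where
    (g, fromVertex, _) = graphFromEdges [ (p, p, ds) | (p, ds) <- deps ]

-- Check that an order respects every dependency edge.
respects :: Deps -> [String] -> Bool
respects deps order = and [ before d p | (p, ds) <- deps, d <- ds ]
  where
    pos x = lookup x (zip order [0 :: Int ..])
    before d p = case (pos d, pos p) of
      (Just i, Just j) -> i < j
      _                -> False

-- A small diamond-shaped example with hypothetical packages.
exampleDeps :: Deps
exampleDeps =
  [ ("top",   ["left", "right"])
  , ("left",  ["base"])
  , ("right", ["base"])
  , ("base",  [])
  ]
```

Because `left` and `right` are independent, more than one order is valid, which is why the particular "snake" in the diagram is not canonical.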

JacquesCarette avatar May 17 '22 01:05 JacquesCarette

Main gen, but also these comments are mainly generic about many of the diagrams:

  • IO() is not a function (so shouldn't be in a rectangle); it's not quite data either... so shouldn't be in a parallelogram either!
  • putting () in the names of functions is not useful, the shape already gives that
  • giving the name of the functions in Haskell is not useful to figure out what the process is. The Rectangles should be abstract processes
  • just the name of the data-type (like Document and DocChoices) is not so useful, as their content and purpose is not clear.

In both that diagram and others, there are dotted arrows labelled "user provided data" (good!) that arrive right after a process (bad!).

Specific to genCode

  • Neither Name nor Description (bottom leaves at right) is user-provided, so where do they come from?

genLog is kind of emblematic of what's not so useful about these diagrams: it tells me there's a function called genLog that gets information from a data-structure of type SystemInformation, and another called PrintingInformation (which itself is derived). The IO type tells me it works via side-effect. And that's it. In particular, I still don't know

  • what is a log
  • what are the side-effects that are done
  • what parts of SystemInformation are used (and why)
  • what parts of PrintingInformation are used (and why)

Basically, the above corresponds to a shallow reading of the code. It does not correspond to understanding what the code does, nor its purpose.

JacquesCarette avatar May 17 '22 02:05 JacquesCarette

Reading further down: the "more abstract" diagram copied from Recipe Wiki is indeed mildly better. It doesn't go deeply enough, but at least it's at the level of the meaning of the code, instead of its low-level details.

JacquesCarette avatar May 17 '22 02:05 JacquesCarette

The Chunks/Knowledge diagram is more useful - but it should be automatically generated. The summer students provided various tools that output the necessary information to get here.

Much more useful would be an explanation of what's going on in that diagram. There is a point to drawing this diagram by hand, and that is to group together things to be able to give a rational explanation of what is 'really' going on here.

JacquesCarette avatar May 17 '22 02:05 JacquesCarette

I mean the functions that we use to convert chunks into other useful chunks. Converting TheoryModels into descriptive sections and tables

@balacij, thanks for the feedback.

There are quite a few functions that fit such a description in Drasil, and I'm not sure how best to abstract away details in order to curate the list into something appropriate/manageable. Would most of the main functions of drasil-printers (expr, pExpr, genTeX, symbol, space, spec, modelExpr, literal, makeDocument, codeExpr, ...), drasil-docLang (mkNb, mkSections (mkToC, mkRefSec, ...), generateTraceTableView, toSentence, tmodel, ddefn, ...), and drasil-theory (ddE, newDEModel, im, ...) be in the diagram you describe?

peter-michalski avatar May 24 '22 14:05 peter-michalski

No problem! And, yes, assuming we're looking at the same functions, those are all good examples. Depending on what we're trying to get out of creating these diagrams, we might not need all of the example function names. I think that how @JacquesCarette questioned genLog is how we should be questioning the details of each transformer.

balacij avatar May 24 '22 15:05 balacij

DrasilDependencies

Building-Drasil(2)

Abstract of Drasil-Example-Main-genCode(2)

Abstract of printSetting

abstract-genLog(1)

abstract-Drasil-Example-Main-genDot

mkSRSDecl(1)

Abstract of fullSIsSI(1)

Drasil-Example-Main-gen(7)

peter-michalski avatar Jun 01 '22 00:06 peter-michalski

@JacquesCarette, the concerns brought up in your comments have been addressed in the new versions of the diagrams, found directly above this comment.

Responding to a specific comment:

giving the name of the functions in Haskell is not useful to figure out what the process is. The Rectangles should be abstract processes

I think that noting the name of important functions is important. This way the reader can easily compare the diagrams to the code and better understand processes at specific points in the code. After all, this was my original intention for the diagrams - to help the reader understand the low-level code, but from a more abstracted (and convenient) perspective than by looking directly at the code. I have decided to keep important function names, but have added text explaining the processes that are being addressed.

@balacij, I have not forgotten about your comments and have your suggested 3 stage (inputs, transformations, outputs) diagrams on my to-do list. In the meantime, if you have any comments regarding the diagrams above, please let me know.

peter-michalski avatar Jun 07 '22 20:06 peter-michalski

@balacij, I've started putting together a 3 stage diagram. I'm wondering if maybe it should just be a TeX table with 3 columns and a row for each transformer. The 3 cells of each row will contain the inputs, transformer, output, and maybe there will be some brief comments in the transformer cell. What do you think? Did you have something more graphical in mind?
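
One way the three-column table could be produced is sketched below: each transformer becomes a record, and a small function renders the rows as a TeX `tabular`. The `Row` type and function names are hypothetical, the single example row reuses the genSRS/ChunkDB description given earlier in this thread, and TeX special characters (such as underscores) are not escaped in this sketch.

```haskell
import Data.List (intercalate)

-- One table row: inputs, the transformer (with an optional brief note), outputs.
data Row = Row
  { rowInputs      :: [String]
  , rowTransformer :: String
  , rowNote        :: String   -- empty string means "no comment"
  , rowOutputs     :: [String]
  }

-- Render a single row as "inputs & transformer & outputs \\".
texRow :: Row -> String
texRow r = intercalate " & " [ins, mid, outs] ++ " \\\\"
  where
    ins  = intercalate ", " (rowInputs r)
    outs = intercalate ", " (rowOutputs r)
    mid | null (rowNote r) = rowTransformer r
        | otherwise        = rowTransformer r ++ " (" ++ rowNote r ++ ")"

-- Render the whole table with a header line.
texTable :: [Row] -> String
texTable rows = unlines $
  [ "\\begin{tabular}{lll}"
  , "Inputs & Transformer & Outputs \\\\"
  , "\\hline"
  ] ++ map texRow rows ++
  [ "\\end{tabular}" ]

-- Example row taken from the description earlier in the thread.
exampleRows :: [Row]
exampleRows = [ Row ["ChunkDB"] "genSRS" "generates an SRS" ["SRS"] ]
```

Keeping the rows as data would also leave the door open to generating the table automatically later.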

I made a rough sketch below. Let me know if you think there are items that should be added (or removed) from the list, and if any other information is missing that you would like to see on there.

Information Stage Diagram

peter-michalski avatar Jun 08 '22 16:06 peter-michalski

It is probably best to only use a small handful of them, just enough that the reader can get the general idea of chunks being interpreted, created, or connected to form other meaningful chunks. We can generate all the rest later, if need be/desired. I think we can also merge them all into a single less concrete diagram with 2 bubbles with a large arrow from one to the other, and with floating lists of the relevant names for inputs/outputs/translators in the bubble/arrow/bubbles. What do you think?

balacij avatar Jun 10 '22 21:06 balacij

@balacij One bubble with a list of inputs, an arrow with a list of translators, and another bubble with a list of outputs? How would the reader know which input/translator/output go together? More importantly, do they need to know? I'm actually not sure; there are benefits both to including this information and to leaving it out. On the one hand, it would be nice for the user to know what gets turned into what (and how). On the other hand, a diagram that only includes lists of inputs, translators, and outputs is still fairly useful, and certainly easier to create, maintain, and cleanly present. It would show the transformers from a high-level conceptual perspective, and the user could always do their own research if they need the lower-level details. I'll create such a diagram and it can then be further evaluated.

peter-michalski avatar Jun 14 '22 15:06 peter-michalski

translators

peter-michalski avatar Jun 14 '22 22:06 peter-michalski

I'm afraid that's not helpful.

Plus many of those 'translators' do really very different things. Some of them are pure constructors, while others do significant work.

The general investigation is good though:

  1. those inputs really do 'represent' something, and carefully documenting what that is, would be helpful
  2. same for the outputs
  3. figuring out the different classes of what you call "translators" by what kind of work they actually perform is also a good idea.

JacquesCarette avatar Jun 15 '22 18:06 JacquesCarette

@JacquesCarette I am assuming that this comment refers to the above translator diagram.

Do you have any comments on the other updated graphs? Should I go ahead and work on placing these in a Wiki?

peter-michalski avatar Jun 15 '22 19:06 peter-michalski

The module dependency graph is automatically generated and already on the web site, right?

The "building Drasil" diagram remains not so informative, as the "snake" is just a topological sort of the dependency graph. The information you give for the stages mixes low-level information about detailed steps taken by Stack with more important 'visible' information that is useful for someone developing Drasil to know. You need to disentangle these.

As for the data-flow diagrams: these are probably accurate, but still quite uninformative. They reveal very little about what's actually going on! Basically the "data flow" inside Drasil just isn't very revealing. What is revealing is

  • the details of the intent of the data stored in the larger structures (Choices, SystemInformation, CodeSpec). The actual details of the representation are not so interesting. The key word here is 'intent'. There's a modeling relation between the data-as-stored and the data-represented. That's the interesting part.
  • a kind of control-flow diagram with the actual steps of the processing would likely reveal quite a lot more

Basically: this information you show is much too verbatim. You need to do a non-trivial interpretation of the code to provide what the 'interesting' bits are, and elide the uninteresting details.

JacquesCarette avatar Jun 15 '22 19:06 JacquesCarette

The module dependency graph is automatically generated and already on the web site, right?

No, this dependency graph was "manually" created using a web resource. The graph on the Wiki is quite different, omitting several packages while including example as a package. It was added in 2019 and it might be outdated?

Suggested changes for "building Drasil" are noted. I will work on this.

As for the comments regarding data-flow diagrams, I think that maybe we have different audiences in mind here. As mentioned in an earlier comment, I had intended for the diagrams to be fairly low level, but with enough abstraction to comment on what the code needs and what the code does. In this perspective, I find the above diagrams fairly useful in quickly understanding the code. That being said, changes to the code will make those diagrams obsolete in short order. They may be useful for developer orientation (at least I think), but only in the short term. What you are looking for is something more abstract, which will be useful (in a different way) to developers, as well as non-developers (my graphs fall short here - people trying to gain a quick understanding of moderate level Drasil transformations would not benefit). Instead of updating the above graphs I will create a second set with this perspective in mind. Maybe we can find uses for graphs with several views.

peter-michalski avatar Jun 15 '22 22:06 peter-michalski

I'm not surprised the graph on the wiki is very out of date. I was thinking there was one on the main site but there isn't. Also: the graph would be much more readable if the transitive dependencies were not shown. It should probably be generated automatically too.
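
Hiding transitive dependencies amounts to computing the transitive reduction of the dependency graph. A minimal sketch for an acyclic graph follows, using `Data.Map` and `Data.Set` from `containers`; the graph type, function names, and example packages are all hypothetical, and this is not an existing Drasil tool.

```haskell
import qualified Data.Map.Strict as M
import qualified Data.Set as S

-- Package name -> set of direct dependencies (placeholder names only).
type DepGraph = M.Map String (S.Set String)

-- Everything reachable from a node by following one or more edges.
reachable :: DepGraph -> String -> S.Set String
reachable g n = go S.empty (S.toList (direct n))
  where
    direct x = M.findWithDefault S.empty x g
    go seen [] = seen
    go seen (x:xs)
      | x `S.member` seen = go seen xs
      | otherwise         = go (S.insert x seen) (S.toList (direct x) ++ xs)

-- Drop edge u -> v whenever v is already reachable through another direct
-- dependency of u. Assumes the graph is acyclic.
transitiveReduction :: DepGraph -> DepGraph
transitiveReduction g = M.mapWithKey prune g
  where
    prune _ vs = S.filter (not . impliedBy vs) vs
    impliedBy vs v =
      any (\w -> w /= v && v `S.member` reachable g w) (S.toList vs)

-- "a" depends on "b" and "c", but "c" is already implied via "b".
exampleGraph :: DepGraph
exampleGraph = M.fromList
  [ ("a", S.fromList ["b", "c"])
  , ("b", S.fromList ["c"])
  , ("c", S.empty)
  ]
```

In the example, the edge from `a` to `c` is removed while `a -> b` and `b -> c` are kept, so the drawn graph carries the same reachability information with fewer arrows.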

Your description of graphs that are useful to understand the code beyond the straight syntactic flow is good. The "what the code is an encoding of" graph would be quite useful indeed.

JacquesCarette avatar Jun 16 '22 15:06 JacquesCarette

Drasil-Example-Outline

Drasil-Example-Gen

Drasil-Example-GenCode

Drasil-Example-GenDot

Drasil-Example-GenLog

peter-michalski avatar Jun 20 '22 22:06 peter-michalski

@JacquesCarette and @balacij are the experts, but the abstraction level used here seems appropriate to me. I think it will help people understand what is going on better than the more detailed data flow diagrams.

smiths avatar Jun 21 '22 03:06 smiths

Plus many of those 'translators' do really very different things. Some of them are pure constructors, while others do significant work.

The general investigation is good though:

  1. those inputs really do 'represent' something, and carefully documenting what that is, would be helpful
  2. same for the outputs
  3. figuring out the different classes of what you call "translators" by what kind of work they actually perform is also a good idea.

Aside(?): Czarnecki's "Overview of Generative Software Development" paper refers to the input/translator/output triple as a "generative domain model" (a problem space -> mapping -> solution space), and he notes at least 2 kinds of "translators" (he calls them "mappings"): configuration view and transformation view. Would it be fair to call the constructors listed "configuration views"?

I think we could try to add to his list of "view" kinds: lenses and introspection, at least (?).
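
If the "translators" were to be classified this way, the bookkeeping could be as simple as the sketch below. The first two kinds are the ones attributed to Czarnecki above, the last two are the additions proposed here, and every type, function, and entry name is a hypothetical placeholder rather than an actual classification of Drasil functions.

```haskell
-- Kinds of mapping from problem space to solution space.
data MappingKind
  = ConfigurationView   -- named in the paper, per the comment above
  | TransformationView  -- named in the paper, per the comment above
  | LensView            -- proposed addition
  | IntrospectionView   -- proposed addition
  deriving (Show, Eq, Ord, Enum, Bounded)

-- A translator tagged with the kind of work it performs.
data Mapping = Mapping
  { mappingName :: String
  , mappingKind :: MappingKind
  }

-- Group translator names under each kind, keeping empty kinds visible.
byKind :: [Mapping] -> [(MappingKind, [String])]
byKind ms =
  [ (k, [ mappingName m | m <- ms, mappingKind m == k ])
  | k <- [minBound .. maxBound]
  ]

-- Placeholder entries; these are not real Drasil functions.
exampleMappings :: [Mapping]
exampleMappings =
  [ Mapping "makeThing"   ConfigurationView
  , Mapping "renderThing" TransformationView
  , Mapping "thingName"   LensView
  ]
```

Listing empty kinds explicitly makes gaps in the classification easy to spot when the real translators are filled in.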

balacij avatar Jul 05 '22 16:07 balacij

Good points @balacij .

JacquesCarette avatar Jul 05 '22 17:07 JacquesCarette