Why does `phrase` have to look in the ChunkDB?

Open hrzhuang opened this issue 1 year ago • 2 comments

Currently the definition of phrase is as follows.

phrase :: NamedIdea n => n -> Sentence
phrase n = sentenceTerm (n ^. uid)

It constructs a Sentence that eventually instructs the printer to look up the UID in the ChunkDB for a term, which is then turned into a Sentence. But doesn't a NamedIdea already have the term as a field? Why are we doing a UID lookup when we already have the thing we are looking for in the ChunkDB? I.e., why isn't phrase defined as follows?

phrase n = phraseNP (n ^. term)

For easy reference, the definition of sentenceTerm is as follows.

sentenceTerm = Ch TermStyle NoCap

Furthermore, why does sentenceTerm instruct the printer to not capitalize the term? As is, capSent . phrase does not result in a capitalized sentence. For example, in #3543 I have capSent $ pluralNP $ progName ^. term because capSent $ plural progName (which seems like a more intuitive way to write it) would not actually do the capitalization.
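
To make the two options concrete, here is a minimal sketch of the difference between a deferred UID reference and an eagerly embedded term. Everything in it is a simplified, hypothetical stand-in: the real `Sentence`, `ChunkDB`, `Ch`, and `capSent` in Drasil are richer, and `phraseDeferred`, `phraseEager`, `render`, and `exampleDB` are names invented for illustration. It assumes `capSent` only rewrites leading literal text, which is one plausible explanation for the behaviour described above.

```haskell
import qualified Data.Map as Map
import Data.Char (toUpper)

-- Hypothetical, simplified stand-ins for Drasil's types (not the real ones).
type UID = String
type ChunkDB = Map.Map UID String   -- maps a UID to its term text

data Sentence
  = S String                -- literal text
  | Ch UID                  -- deferred reference, resolved by the printer
  | Sentence :+: Sentence   -- concatenation

-- Deferred: keep only the UID, in the spirit of the current phrase.
phraseDeferred :: UID -> Sentence
phraseDeferred = Ch

-- Eager: embed the term text right away, in the spirit of the proposal.
phraseEager :: String -> Sentence
phraseEager = S

-- Assumption: capitalization only touches leading literal text.
capSent :: Sentence -> Sentence
capSent (S (c:cs)) = S (toUpper c : cs)
capSent (a :+: b)  = capSent a :+: b
capSent s          = s   -- a Ch is opaque here, so it is left unchanged

-- The printer resolves each Ch against the database.
render :: ChunkDB -> Sentence -> String
render _  (S t)     = t
render db (Ch u)    = Map.findWithDefault ("<" ++ u ++ ">") u db
render db (a :+: b) = render db a ++ render db b

exampleDB :: ChunkDB
exampleDB = Map.fromList [("progName", "program name")]

main :: IO ()
main = do
  putStrLn (render exampleDB (capSent (phraseDeferred "progName")))
  putStrLn (render exampleDB (capSent (phraseEager "program name")))
```

In this model the deferred version prints the term uncapitalized because `capSent` never sees the text, while the eager version is capitalized as expected.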

hrzhuang avatar Jul 18 '23 17:07 hrzhuang

This is a general theme in Drasil, but it's intentional. The symbolic UID usage there lets us refactor based on UID, ensure consistency across the UIDs used, and do basic analysis of where and how UIDs appear. I believe one of the initial reasons was also to cut down on resources, though I don't quite remember. @JacquesCarette or @smiths could probably fill this in for us.
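
As a rough illustration of the kind of analysis that symbolic UIDs allow, the sketch below checks a sentence for UIDs that have no entry in the database. The types are simplified, hypothetical stand-ins for Drasil's own, and `usedUIDs`, `missingUIDs`, `exampleDB`, and `exampleSentence` are names made up for this example.

```haskell
import qualified Data.Map as Map

-- Hypothetical, simplified stand-ins for Drasil's types (not the real ones).
type UID = String
type ChunkDB = Map.Map UID String

data Sentence
  = S String
  | Ch UID
  | Sentence :+: Sentence

-- Collect every UID referenced in a sentence, in order of appearance.
usedUIDs :: Sentence -> [UID]
usedUIDs (S _)     = []
usedUIDs (Ch u)    = [u]
usedUIDs (a :+: b) = usedUIDs a ++ usedUIDs b

-- UIDs that the sentence references but the database does not define.
missingUIDs :: ChunkDB -> Sentence -> [UID]
missingUIDs db = filter (`Map.notMember` db) . usedUIDs

exampleDB :: ChunkDB
exampleDB = Map.fromList [("pendulum", "pendulum")]

exampleSentence :: Sentence
exampleSentence = S "The " :+: Ch "pendulum" :+: S " has a " :+: Ch "mass"

main :: IO ()
main = do
  print (usedUIDs exampleSentence)
  print (missingUIDs exampleDB exampleSentence)
```

A check like this is only possible because the UID survives in the `Sentence`; once the term is flattened to plain text, the reference can no longer be recovered.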

I'm not sure about the capitalization, unfortunately.

balacij avatar Jul 18 '23 17:07 balacij

If I remember well (and this was quite a while ago, as I'm sure git blame will attest), we wanted phrase to not be so eager to remove all traceability information. We wanted what chunks were used to remain in the Sentence so that analysis phases could extract them.

The way it was implemented was possibly too crude. A better solution would be to record the uid and the term in what is produced. Furthermore the 'embedding' into Sentence should not require us to make premature rendering choices.

I think what this says is that the constructors of Sentence need to change (or, at the very least, be augmented).

The fundamental mistake is choosing between uid and term. There is no reason to not have both.
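
One way to picture "having both" is a constructor that carries the UID alongside the term, with capitalization left as a request that the renderer honours later. This is only a sketch under assumptions: the `Term` constructor, the two-value `Cap` type, and the simplified `phrase`, `capSent`, `usedUIDs`, and `render` below are hypothetical and are not Drasil's actual definitions.

```haskell
import Data.Char (toUpper)

type UID = String

-- Hypothetical capitalization request, chosen by the caller, applied at render time.
data Cap = NoCap | CapFirst

data Sentence
  = S String
  | Term UID String Cap     -- hypothetical: keeps both the UID and the term text
  | Sentence :+: Sentence

-- Embed a term without committing to a rendering choice beyond the default.
phrase :: UID -> String -> Sentence
phrase u t = Term u t NoCap

-- Capitalization can now reach a term, because the term is visible.
capSent :: Sentence -> Sentence
capSent (S (c:cs))   = S (toUpper c : cs)
capSent (Term u t _) = Term u t CapFirst
capSent (a :+: b)    = capSent a :+: b
capSent s            = s

-- Traceability is preserved: the UIDs can still be extracted.
usedUIDs :: Sentence -> [UID]
usedUIDs (S _)        = []
usedUIDs (Term u _ _) = [u]
usedUIDs (a :+: b)    = usedUIDs a ++ usedUIDs b

capFirst :: String -> String
capFirst (c:cs) = toUpper c : cs
capFirst []     = []

-- No database lookup is needed, since the term travels with the UID.
render :: Sentence -> String
render (S t)               = t
render (Term _ t NoCap)    = t
render (Term _ t CapFirst) = capFirst t
render (a :+: b)           = render a ++ render b

main :: IO ()
main = do
  let s = capSent (phrase "progName" "program name") :+: S " is defined."
  putStrLn (render s)
  print (usedUIDs s)
```

Storing plain text for the term is a simplification; a fuller design would presumably store the noun phrase itself so that plural and other forms remain available.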

One might think -- why not embed whole Chunks in there? That way, no information is lost? Turns out that that level of polymorphism is really awkward to work with.

Excellent question!

JacquesCarette avatar Jul 20 '23 19:07 JacquesCarette