Drasil Continue building UID system

This issue will probably take a good chunk of time to complete, but I think we will want to start getting rid of functions that can overwrite information within a given UID. I originally set out to try and fill in more fields of the database automatically (#2775), but the concepts, quantities, and qdefinitions and modelkinds fields have the potential to give conflicting information to the same UID, which could result in some weird behaviour.

mkQuantDef' is defined as follows:

-- | Used to help make 'QDefinition's when 'UID' and 'Symbol' come from the same source, with the term separate.
mkQuantDef' :: (Quantity c, MayHaveUnit c) => c -> NP -> Expr -> QDefinition
mkQuantDef' c t = mkQDefSt (c ^. uid) t EmptyS (symbol c) (c ^. typ) (getUnit c)

Which gives it the ability to overwrite a lower-levelled chunk's NP when I don't think it should. It's used most often to make general chunks more precise (ex. The pendulum examples use a general acceleration chunk and replace its general term with one that makes sense in the context of a pendulum). Modifying the data assigned to a UID without modifying the UID itself could then result in unexpected terms popping up. To follow up the pendulum example, a user can no longer use the generalized definition of acceleration because that information has been lost and replaced with the specific one.

I think this sort of problem is very similar to #1116 and #91. There are also some similar constructors defined in drasil-theory for ModelKind (see #2775)

Edit: This issue can be split up into the following:

[ ] Change UID to a newtype
[ ] Change format of constructors - some should take UID, some String, and some should append the ► symbol
[ ] Implement the UID checker function

Aug 13 '21 15:08 Ant13731

Which gives it the ability to overwrite a lower-levelled chunk's NP when I don't think it should.

This is very relevant to #2388. It makes certain chunks unreachable (overwritten in the UID maps).

Instead of getting rid of mkQuantDef' and similar functions, would it suffice, in general, to, when inheriting a UID, transform the originating UID in some way? For example, with mkQuantDef', we could just append "QD" to the end of the UID. This would likely require us to go in and edit the existing code to make sure that our references are pointing to the right things. The UIDs are never really shown to end-users, so I'm not quite sure if we care about them being pretty.

Aug 13 '21 15:08 balacij

We indeed to not care about UIDs being pretty. Right now they are short, but I always thought they might 'morph' into something much more structured than just a String, as the need arose.

I would probably prefer to use something better than just appending QD. Too dangerous. Maybe use a custom append function (for uids) that also inserts some obscure unicode character as a 'link' that isn't likely to be used by people (at least not by people who aren't explicitly trying to break things)?

Aug 17 '21 19:08 JacquesCarette

Right now they are short, but I always thought they might 'morph' into something much more structured than just a String, as the need arose.

I can definitely see something else potentially needed/wanted. Something else might also show a clear development trail scheme for some things.

I would probably prefer to use something better than just appending QD. Too dangerous. Maybe use a custom append function (for uids) that also inserts some obscure unicode character as a 'link' that isn't likely to be used by people (at least not by people who aren't explicitly trying to break things)?

Sounds good to me. We could then enforce that the character also isn't in the String (UID) already when creating a new UID, and that it's not in an appended String when creating a derivative UID. Perhaps a right arrow, ►, or a triple nested-greater than, ⫸?

@Ant13731, what do you think?

Aug 17 '21 20:08 balacij

Since one might want to append more than once, we can't check if that character isn't there. We could do something more clever if we changed UID to be a newtype, then we could trust things that are already UID but not raw strings. I have no strong feelings on which character to use.

Aug 17 '21 20:08 JacquesCarette

Yep, sounds good to me. Should each ModelKind function get its own unicode character as well?

Edit: So would the first step be to make UID a newtype? Edit 2: Since we're changing UID from a String to a String with a constructor, the type signatures of all the chunk constructors will change. Should these take in a String then instead of a UID?

Aug 18 '21 12:08 Ant13731

I think that just 1 special character is enough, meaning that whatever it's tagged to was derived from some chain of things. For example, a►b►c would mean that c was derived from a►b, and that b was derived from a.

I believe the constructors should be taking a UID still.

Aug 18 '21 14:08 balacij

Ahh okay, I see.

I believe the constructors should be taking a UID still.

If UID turns into a newtype, would that require the user to give an input of UID "actualUID" instead of just their "actualUID"? So making a chunk would look like this:

myNewChunk :: NamedChunk
myNewChunk = nc (UID "chunkUID") chunkTerm

rather than this:

myNewChunk :: NamedChunk
myNewChunk = nc "chunkUID" chunkTerm

I was thinking that if we take something that will eventually be a UID as a String first, then we know that we have to run it through something to check that there are no invalid characters (like ►) when the user initially gives their UID to the database. And then constructors that build on top of other chunks won't need a string since the UID will be passed through the chunk. So something like the NamedChunk smart constructor would look like:

nc :: String -> NP -> NamedChunk
nc s = NC (UID uidCheckerFunction s)

while the IdeaDict wrapper function would remain the same (with no function to check the UID):

nw :: Idea c => c -> IdeaDict
nw c = IdeaDict (NC (c^.uid) (c^.term)) (getA c)

But I'm not really sure what the checking function would look like, or if that would even work.

Or, thinking of #2775, what if we give the UID type something that takes in a list of its 'chunk types'. If we can do something along those lines, Drasil could then know its own type structure. This implementation wouldn't work exactly, but something along the lines of:

data UID = UID {
  _getUID :: String,
  chunkType :: [ChunkType]
}
data ChunkType = NamedIdea | IdeaDict | CI | ConceptChunk | QuantityDict | ...

Hmm on second thought, I don't think it would be that great of a change, but I'll still leave the idea here anyways.

Aug 18 '21 15:08 Ant13731

As a side note, while I was fixing alll of the UID things to work with newtype, I came across this:

performance       = dcc "performance"       (nounPhraseSP "performance")
  "the action or process of carrying out or accomplishing an action, task, or function"

performanceSpd    = dcc (uidToStr $ performance ^. uid) (nounPhrase'' (phrase performance) (S "speed") CapFirst CapWords)
  "the action or process of carrying out or accomplishing an action, task, or function quickly"

Just leaving it here as a note since performanceSpd looks a little off, especially with the reuse of uids and completely different phrase and plural forms.

Aug 18 '21 15:08 Ant13731

If UID turns into a newtype, would that require the user to give an input of UID "actualUID" instead of just their "actualUID"?

Yep, though, it would probably best to use smart constructors to ensure length + not containing bad character.

I can see it being (effectively, but there's probably a better way of encoding this):

newtype UID = UID String

uid :: String -> UID
uid s
  | '►' `elem` s = error "► not allowed in UID"
  | null s       = error "UID must be non-zero length"
  | otherwise    = UID s

derivUID :: HasUID a => a -> String -> UID
derivUID a suff
  | '►' `elem` suff = error "► not allowed in UID"
  | null suff       = error "Suffix must be non-zero length"
  | otherwise       = UID $ s ++ '►':suff
    where UID s = a ^. getUID -- pretending the HasUID's Lens is called getUID instead of uid here

Regarding the second solution, I'm not quite sure. It's definitely a good thought. It might be good to discuss it alongside #2599?

Aug 18 '21 16:08 balacij

For constructors that take in one string and use it in multiple places, do we want it to only take a UID or just a String?

-- | Helper for fundamental unit concept chunk creation. Uses the same 'String'
-- for the identifier, term, and definition.
unitCon :: String -> ConceptChunk
unitCon s = dcc (toUID s) (cn' s) s

Aug 18 '21 16:08 Ant13731

Yes, a single unicode character.

@balacij solution with uid is the way to go. Note that we will want many of the constructors of chunks to still take String, so that they can be checked. I'd call derivUID something like qualify or even +++ (to be used infix).

@Ant13731 can you make your side note into an issue? Otherwise it may get lost.

Note that we do not want to export the constructor UID outside of drasil-lang at all. We shouldn't export uid either, except maybe from the .Development package. We really want UID to be treated like an abstract type outside of drasil-lang.

Having a 'typed UID' that carries information about what it points to is definitely an interesting future idea.

Aug 18 '21 19:08 JacquesCarette

For the HasUID class, what should the method be called (since we're using uid as a constructor now)?

Aug 18 '21 19:08 Ant13731

@JacquesCarette I like those names better, I agree. +++ sounds especially good.

@Ant13731 Does it still cause a name clash in the Language.Drasil.Development module? If not, can we use qualified imports so that we also avoid a name clash when importing Language.Drasil and Language.Drasil.Development?

Aug 18 '21 19:08 balacij

Oh okay I see. I can try and see how far I get while using the same for both. I don't think there's any major problems with it inside of drasil-lang, I'll just import the new uid as qualified from a new file.

Aug 18 '21 19:08 Ant13731

Right - I think that uid could be the lens outside of data-drasil, and the smart constructor inside data-drasil. I don't think we will need many explicit calls to the constructor uid outside, if we structure things properly.

Aug 18 '21 20:08 JacquesCarette

I switched the way I was going about doing this halfway through on another branch (newtypeUID2) and I'll probably continue on that one (rather than newtypeUID)

Aug 18 '21 20:08 Ant13731

Should the mkMultiDefn constructor in drasil-theory take in a String or a UID? It's only being used as a helper for other multi definition related constructors but I'm not sure if it'll be used in examples in the future (since we don't really want users to have to interact with the uid constructor manually). Either way is an easy change to make, but I thought I should note it here.

-- | Smart constructor for MultiDefns, does nothing special at the moment. First argument is the 'String' to become a 'UID'.
mkMultiDefn :: String -> QuantityDict -> Sentence -> NE.NonEmpty DefiningExpr -> MultiDefn
mkMultiDefn u q s des
  | length des == dupsRemovedLen = MultiDefn (D.uid u) q s des
  | otherwise                    = error $
    "MultiDefn '" ++ u ++ "' created with non-unique list of expressions"
  where dupsRemovedLen = length $ NE.nub des

-- Should showUID be used here?
-- | Smart constructor for MultiDefns defining UIDs using that of the QuantityDict.
mkMultiDefnForQuant :: QuantityDict -> Sentence -> NE.NonEmpty DefiningExpr -> MultiDefn
mkMultiDefnForQuant q = mkMultiDefn (showUID q) q

Edit: Same with deModel, equationalConstraints, equationalModel, etc.

Aug 19 '21 12:08 Ant13731

Hmm, I'd say it should be a String, representing a suffix however, that's appended to the UID of the QuantityDict given to it.

Regarding the other constructors, I imagine that they could all be String suffixes as well. For the ones that just directly grab an underlying UID, those might need to have some component added to them. For the ones with a custom UID (now), we can change them to be a custom String for a UID suffix appended to whatever it derives from.

Aug 19 '21 14:08 balacij

I had a personal note to create a ticket to continue the new UID system. However, seeing as we still have this ticket open where the majority of the discussion occurs, and that I think the original goal of this ticket was moved around a bit (to the new UID system), I think we should just repurpose this issue for that new UID system (essentially just renaming the title of this ticket + the original post's discussion to direct towards later comments). Would this be okay?

Oct 10 '21 18:10 balacij

Sure.

Oct 11 '21 13:10 JacquesCarette

Drasil Drasil copied to clipboard

Continue building UID system

Drasil
Drasil copied to clipboard