Drasil
Drasil copied to clipboard
Continue building UID system
This issue will probably take a good chunk of time to complete, but I think we will want to start getting rid of functions that can overwrite information within a given UID. I originally set out to try and fill in more fields of the database automatically (#2775), but the concepts, quantities, and qdefinitions and modelkinds fields have the potential to give conflicting information to the same UID, which could result in some weird behaviour.
mkQuantDef'
is defined as follows:
-- | Used to help make 'QDefinition's when 'UID' and 'Symbol' come from the same source, with the term separate.
mkQuantDef' :: (Quantity c, MayHaveUnit c) => c -> NP -> Expr -> QDefinition
mkQuantDef' c t = mkQDefSt (c ^. uid) t EmptyS (symbol c) (c ^. typ) (getUnit c)
Which gives it the ability to overwrite a lower-levelled chunk's NP
when I don't think it should. It's used most often to make general chunks more precise (ex. The pendulum examples use a general acceleration
chunk and replace its general term with one that makes sense in the context of a pendulum). Modifying the data assigned to a UID without modifying the UID itself could then result in unexpected terms popping up. To follow up the pendulum example, a user can no longer use the generalized definition of acceleration because that information has been lost and replaced with the specific one.
I think this sort of problem is very similar to #1116 and #91. There are also some similar constructors defined in drasil-theory
for ModelKind
(see #2775)
Edit: This issue can be split up into the following:
- [ ] Change UID to a newtype
- [ ] Change format of constructors - some should take UID, some String, and some should append the ► symbol
- [ ] Implement the UID checker function
Which gives it the ability to overwrite a lower-levelled chunk's NP when I don't think it should.
This is very relevant to #2388. It makes certain chunks unreachable (overwritten in the UID maps).
Instead of getting rid of mkQuantDef'
and similar functions, would it suffice, in general, to, when inheriting a UID, transform the originating UID in some way? For example, with mkQuantDef'
, we could just append "QD" to the end of the UID. This would likely require us to go in and edit the existing code to make sure that our references are pointing to the right things. The UIDs are never really shown to end-users, so I'm not quite sure if we care about them being pretty.
We indeed to not care about UIDs being pretty. Right now they are short, but I always thought they might 'morph' into something much more structured than just a String
, as the need arose.
I would probably prefer to use something better than just appending QD
. Too dangerous. Maybe use a custom append function (for uids) that also inserts some obscure unicode character as a 'link' that isn't likely to be used by people (at least not by people who aren't explicitly trying to break things)?
Right now they are short, but I always thought they might 'morph' into something much more structured than just a String, as the need arose.
I can definitely see something else potentially needed/wanted. Something else might also show a clear development trail scheme for some things.
I would probably prefer to use something better than just appending QD. Too dangerous. Maybe use a custom append function (for uids) that also inserts some obscure unicode character as a 'link' that isn't likely to be used by people (at least not by people who aren't explicitly trying to break things)?
Sounds good to me. We could then enforce that the character also isn't in the String (UID) already when creating a new UID, and that it's not in an appended String when creating a derivative UID. Perhaps a right arrow, ►
, or a triple nested-greater than, ⫸
?
@Ant13731, what do you think?
Since one might want to append more than once, we can't check if that character isn't there. We could do something more clever if we changed UID
to be a newtype
, then we could trust things that are already UID
but not raw strings. I have no strong feelings on which character to use.
Yep, sounds good to me. Should each ModelKind
function get its own unicode character as well?
Edit: So would the first step be to make UID
a newtype
?
Edit 2: Since we're changing UID
from a String
to a String
with a constructor, the type signatures of all the chunk constructors will change. Should these take in a String
then instead of a UID
?
I think that just 1 special character is enough, meaning that whatever it's tagged to was derived from some chain of things. For example, a►b►c
would mean that c
was derived from a►b
, and that b
was derived from a
.
I believe the constructors should be taking a UID
still.
Ahh okay, I see.
I believe the constructors should be taking a UID still.
If UID
turns into a newtype
, would that require the user to give an input of UID "actualUID"
instead of just their "actualUID"
? So making a chunk would look like this:
myNewChunk :: NamedChunk
myNewChunk = nc (UID "chunkUID") chunkTerm
rather than this:
myNewChunk :: NamedChunk
myNewChunk = nc "chunkUID" chunkTerm
I was thinking that if we take something that will eventually be a UID
as a String
first, then we know that we have to run it through something to check that there are no invalid characters (like ►) when the user initially gives their UID
to the database. And then constructors that build on top of other chunks won't need a string since the UID
will be passed through the chunk. So something like the NamedChunk smart constructor would look like:
nc :: String -> NP -> NamedChunk
nc s = NC (UID uidCheckerFunction s)
while the IdeaDict wrapper function would remain the same (with no function to check the UID):
nw :: Idea c => c -> IdeaDict
nw c = IdeaDict (NC (c^.uid) (c^.term)) (getA c)
But I'm not really sure what the checking function would look like, or if that would even work.
Or, thinking of #2775, what if we give the UID
type something that takes in a list of its 'chunk types'. If we can do something along those lines, Drasil could then know its own type structure. This implementation wouldn't work exactly, but something along the lines of:
data UID = UID {
_getUID :: String,
chunkType :: [ChunkType]
}
data ChunkType = NamedIdea | IdeaDict | CI | ConceptChunk | QuantityDict | ...
Hmm on second thought, I don't think it would be that great of a change, but I'll still leave the idea here anyways.
As a side note, while I was fixing alll of the UID
things to work with newtype
, I came across this:
performance = dcc "performance" (nounPhraseSP "performance")
"the action or process of carrying out or accomplishing an action, task, or function"
performanceSpd = dcc (uidToStr $ performance ^. uid) (nounPhrase'' (phrase performance) (S "speed") CapFirst CapWords)
"the action or process of carrying out or accomplishing an action, task, or function quickly"
Just leaving it here as a note since performanceSpd
looks a little off, especially with the reuse of uids and completely different phrase and plural forms.
If UID turns into a newtype, would that require the user to give an input of UID "actualUID" instead of just their "actualUID"?
Yep, though, it would probably best to use smart constructors to ensure length + not containing bad character.
I can see it being (effectively, but there's probably a better way of encoding this):
newtype UID = UID String
uid :: String -> UID
uid s
| '►' `elem` s = error "► not allowed in UID"
| null s = error "UID must be non-zero length"
| otherwise = UID s
derivUID :: HasUID a => a -> String -> UID
derivUID a suff
| '►' `elem` suff = error "► not allowed in UID"
| null suff = error "Suffix must be non-zero length"
| otherwise = UID $ s ++ '►':suff
where UID s = a ^. getUID -- pretending the HasUID's Lens is called getUID instead of uid here
Regarding the second solution, I'm not quite sure. It's definitely a good thought. It might be good to discuss it alongside #2599?
For constructors that take in one string and use it in multiple places, do we want it to only take a UID or just a String?
-- | Helper for fundamental unit concept chunk creation. Uses the same 'String'
-- for the identifier, term, and definition.
unitCon :: String -> ConceptChunk
unitCon s = dcc (toUID s) (cn' s) s
Yes, a single unicode character.
@balacij solution with uid
is the way to go. Note that we will want many of the constructors of chunks to still take String
, so that they can be checked. I'd call derivUID
something like qualify
or even +++
(to be used infix).
@Ant13731 can you make your side note into an issue? Otherwise it may get lost.
Note that we do not want to export the constructor UID
outside of drasil-lang
at all. We shouldn't export uid
either, except maybe from the .Development
package. We really want UID
to be treated like an abstract type outside of drasil-lang
.
Having a 'typed UID' that carries information about what it points to is definitely an interesting future idea.
For the HasUID
class, what should the method be called (since we're using uid
as a constructor now)?
@JacquesCarette I like those names better, I agree. +++
sounds especially good.
@Ant13731 Does it still cause a name clash in the Language.Drasil.Development
module? If not, can we use qualified imports so that we also avoid a name clash when importing Language.Drasil
and Language.Drasil.Development
?
Oh okay I see. I can try and see how far I get while using the same for both. I don't think there's any major problems with it inside of drasil-lang
, I'll just import the new uid
as qualified from a new file.
Right - I think that uid
could be the lens outside of data-drasil
, and the smart constructor inside data-drasil
. I don't think we will need many explicit calls to the constructor uid
outside, if we structure things properly.
I switched the way I was going about doing this halfway through on another branch (newtypeUID2
) and I'll probably continue on that one (rather than newtypeUID
)
Should the mkMultiDefn
constructor in drasil-theory
take in a String
or a UID
? It's only being used as a helper for other multi definition related constructors but I'm not sure if it'll be used in examples in the future (since we don't really want users to have to interact with the uid
constructor manually). Either way is an easy change to make, but I thought I should note it here.
-- | Smart constructor for MultiDefns, does nothing special at the moment. First argument is the 'String' to become a 'UID'.
mkMultiDefn :: String -> QuantityDict -> Sentence -> NE.NonEmpty DefiningExpr -> MultiDefn
mkMultiDefn u q s des
| length des == dupsRemovedLen = MultiDefn (D.uid u) q s des
| otherwise = error $
"MultiDefn '" ++ u ++ "' created with non-unique list of expressions"
where dupsRemovedLen = length $ NE.nub des
-- Should showUID be used here?
-- | Smart constructor for MultiDefns defining UIDs using that of the QuantityDict.
mkMultiDefnForQuant :: QuantityDict -> Sentence -> NE.NonEmpty DefiningExpr -> MultiDefn
mkMultiDefnForQuant q = mkMultiDefn (showUID q) q
Edit: Same with deModel
, equationalConstraints
, equationalModel
, etc.
Hmm, I'd say it should be a String
, representing a suffix however, that's appended to the UID of the QuantityDict
given to it.
Regarding the other constructors, I imagine that they could all be String suffixes as well. For the ones that just directly grab an underlying UID, those might need to have some component added to them. For the ones with a custom UID (now), we can change them to be a custom String for a UID suffix appended to whatever it derives from.
I had a personal note to create a ticket to continue the new UID system. However, seeing as we still have this ticket open where the majority of the discussion occurs, and that I think the original goal of this ticket was moved around a bit (to the new UID system), I think we should just repurpose this issue for that new UID system (essentially just renaming the title of this ticket + the original post's discussion to direct towards later comments). Would this be okay?
Sure.