Duplicate `UID`s with different data
The CI for https://github.com/JacquesCarette/Drasil/pull/4089 originally failed. When the ChunkDB has an insertion on a duplicate UID, it normally does an overwrite. Originally, in said PR, I kept the original and ignored the new one (by accident!). The CI failed because build/ did not match stable/! However, even with duplicate UIDs being inserted, the data should have been the same!
@jackwyand Do you think you can analyze the diff and where the differences are coming from? Cherry-picking https://github.com/JacquesCarette/Drasil/commit/718371bfb2f8fc2cf5a55e6a47ec0a60bd144a96 locally might be helpful.
This is likely to be quite fruitful indeed.
Diffs
There are 3 examples that had a diff in the commit:
- GamePhysics
- DblPend
- Projectile
GamePhysics
Due to no longer overwriting the duplicate UIDs, the reason for the diff in GamePhysics has to do with the Chunks iVect and dVect being added to the ChunkDB. https://github.com/JacquesCarette/Drasil/blob/74b4803c9e91668eb41c696fae15a8fb4ceb6410/code/drasil-example/gamephysics/lib/Drasil/GamePhysics/Unitals.hs#L61-L72
unitalChunks is added to the SymbolMap and TermMap of the ChunkDB, and it contains both iVect and dVect which share a duplicate UID. Since iVect is added first, it is replacing the usage of dVect which changes symbols and the description of the symbol in the SRS.
In the diff, we can see that $\hat{\mathbf{d}}$ is replaced with $\hat{\mathbf{i}}$, along with the description. Description
diff --strip-trailing-cr -r -X ../.gitignore stable/gamephysics/SRS/HTML/GamePhysics_SRS.html build/gamephysics/SRS/HTML/GamePhysics_SRS.html
210,216d209
< <td><em><b>d̂</b></em></td>
< <td>
< Unit vector directed from the center of the large mass to the center of the smaller mass
< </td>
< <td><em>m</em></td>
< </tr>
< <tr>
283a277,281
> <td><em><b>î</b></em></td>
> <td>Horizontal UnitVector</td>
> <td><em>m</em></td>
> </tr>
> <tr>
Instance of usage (the last symbol was changed to $\hat{\mathbf{i}}$)
< \[\symbf{g}=-\frac{G\,M}{\|\symbf{d}\|^{2}}\,\symbf{\hat{d}}\]
---
> \[\symbf{g}=-\frac{G\,M}{\|\symbf{d}\|^{2}}\,\symbf{\hat{i}}\]
DblPend
In DblPend, it looks like this issue was already addressed in #4088 but I will quickly mention it here. The reason for the diff is because of startOriginSingle and startOriginDouble sharing a duplicate UID. https://github.com/JacquesCarette/Drasil/blob/718371bfb2f8fc2cf5a55e6a47ec0a60bd144a96/code/drasil-example/dblpend/lib/Drasil/DblPend/Assumptions.hs#L30-L31
Both of these are added to the ConceptInstances in the ChunkDB, and since the original one is now the one used, it is using the incorrect description (startOriginDescSingle) as seen in the diff.
diff --strip-trailing-cr -r -X ../.gitignore stable/dblpend/SRS/HTML/DblPend_SRS.html build/dblpend/SRS/HTML/DblPend_SRS.html
646c646
< startOrigin: The first rod is attached to the origin.
---
> startOrigin: The pendulum is attached to the origin.
Projectile
Projectile is a little less straightforward for me. I could not find an explicit duplicate UID of the chunk causing the issue, but the changes made in the commit has caused an interesting diff to the "output message." Here is where it is defined in the code: https://github.com/JacquesCarette/Drasil/blob/718371bfb2f8fc2cf5a55e6a47ec0a60bd144a96/code/drasil-example/projectile/lib/Drasil/Projectile/Unitals.hs#L41-L43
Based on the diff, it looks like it is using the UID ("message") rather than the noun phrase that is used in stable/ (S "output message as a string").
diff --strip-trailing-cr -r -X ../.gitignore stable/projectile/SRS/HTML/Projectile_SRS.html build/projectile/SRS/HTML/Projectile_SRS.html
1680c1680
< <li><em>s</em> is the output message as a string (Unitless)</li>
---
> <li><em>s</em> is the message (Unitless)</li>
I am not sure if it is related to the duplicate UIDs, or something deeper? I will continue looking a bit to see if I can uncover anything else.
Projectile (continued)
Looking further into the code, the cause of the diff for projectile comes from the message IdeaDict that is added in doccon. It is added in the TermMap before the correct one, hence why it is resulting in the diff. https://github.com/JacquesCarette/Drasil/blob/718371bfb2f8fc2cf5a55e6a47ec0a60bd144a96/code/drasil-data/lib/Data/Drasil/Concepts/Documentation.hs#L175
https://github.com/JacquesCarette/Drasil/blob/718371bfb2f8fc2cf5a55e6a47ec0a60bd144a96/code/drasil-example/projectile/lib/Drasil/Projectile/Body.hs#L169-L177
In the second code snippet, doccon can be seen being added before unitalIdeas (where the correct message exists) in the second argument (the TermMap).
Thank you for the analysis @jackwyand. For the game physics example, the new code is wrong. The unit vector in the horizontal direction is not the same thing as the unit vector in the direction connecting the two centres of mass. For the double pendulum, I think we can remove the assumptions about the attachment of the rods.
For projectile, changing "is the output message as a string (Unitless)" to "is the message (Unitless)" doesn't seem like that big a deal to me. I'm fine with the new version, although maybe the fact that it changed is an indication of something more concerning going on? :smile:
For GamePhysics: we need to find out what iVect and dVect are supposed to mean. They should definitely have different UIDs. Changing that might help reveal what's what.
For Projectile: the QuantityDict message should be renamed messageString (same with its UID) in Projectile's Unitals.hs. Then the example should be changed to no longer produce an output diff.
In all cases, this was a duplicate UID in error, in that the concepts were different. message is the general concept of a message where messageString is a particular representation of that general concept.
iVect is the unit vector along the $x$ axis in a Cartesian coordinate system. dVect is unit vector in the direction of a vector connecting the centre of mass of two different masses. I agree that they should definitely have different UIDs. They share the definition of unit vectors (length of 1), but they come up in different contexts.
Yep, I think all of these stem from duplicate UIDs so fixing those is the priority.
All of these duplicates have been fixed so no more diffs should occur if the ordering of adding them is changed, hence I think this issue can be closed.
Thank you, @jackwyand !