Duplicated objects

Open pablo-de-andres opened this issue 4 years ago • 4 comments

In GitLab by @abahde on Jun 9, 2020, 14:50

Let's say a user wants to link two CUDS objects. The user creates both objects on their machine and sets a relationship between them. What happens if one of these objects is already present in the dataspace (in the sense that the dataspace already contains an object with the same meaning)? The user should link their object to the one from the dataspace instead of creating a new one. An example:

A tensile test is performed by a tensile test machine (e.g. Machine No. 3 in the IWM). The two objects are of oclass TENSILE_TEST and TENSILE_TEST_MACHINE. But if the tensile test machine already exists in the dataspace, the user should use the TENSILE_TEST_MACHINE object from the dataspace instead of creating a new one. Otherwise the semantic link is missed and we end up with two TENSILE_TEST_MACHINE objects in the dataspace that are actually the same machine in the real world.

In my opinion this is a general problem that users should be aware of. The open question is whether we want to provide functionality that helps the user here. A few ideas:

  • We provide a way to mark attributes as unique. All objects in the dataspace that have the same oclass and the same value for a unique attribute are duplicates. In the tensile test example, every TENSILE_TEST_MACHINE object with the attribute label='machine no 3' is a duplicate, and we raise a warning. This idea still does not catch the full problem: label='machine no three' would not be detected as a duplicate.

  • We make a whole documentation story out of that and make sure that users understand the issue. We could provide a utility function that searches the space for duplicates based on a criterion. It then returns either the found object or None if no duplicate was found.
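The utility function from the second idea could look roughly like the sketch below. It is only an illustration: `CudsStub`, `find_duplicate` and `same_oclass_and_label` are hypothetical names and not part of the simphony-osp API, and the dataspace is modelled as a plain list of objects.

```python
import uuid
from dataclasses import dataclass, field
from typing import Callable, Iterable, Optional


@dataclass
class CudsStub:
    """Hypothetical stand-in for a CUDS object (not the real simphony-osp class)."""
    oclass: str
    label: str
    uid: uuid.UUID = field(default_factory=uuid.uuid4)


def find_duplicate(
    candidate: CudsStub,
    dataspace: Iterable[CudsStub],
    same_meaning: Callable[[CudsStub, CudsStub], bool],
) -> Optional[CudsStub]:
    """Return the first object in the dataspace that matches the criterion, else None."""
    for existing in dataspace:
        # Skip the candidate itself in case it is already stored in the dataspace.
        if existing.uid != candidate.uid and same_meaning(candidate, existing):
            return existing
    return None


def same_oclass_and_label(a: CudsStub, b: CudsStub) -> bool:
    """Example criterion: same oclass and same label, ignoring case and outer whitespace."""
    return a.oclass == b.oclass and a.label.strip().lower() == b.label.strip().lower()


# Assumed example data: one machine is already in the dataspace.
dataspace = [CudsStub("TENSILE_TEST_MACHINE", "machine no 3")]

new_machine = CudsStub("TENSILE_TEST_MACHINE", "Machine No 3")
found = find_duplicate(new_machine, dataspace, same_oclass_and_label)

# A purely textual criterion misses differently spelled labels.
spelled_out = CudsStub("TENSILE_TEST_MACHINE", "machine no three")
missed = find_duplicate(spelled_out, dataspace, same_oclass_and_label)
```

Here `found` is the object already stored in the dataspace, so the user can link to it instead of adding `new_machine`, while `missed` is None. Passing the criterion as a function keeps the decision about what counts as "the same meaning" with the user.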

pablo-de-andres avatar Jun 22 '20 14:06 pablo-de-andres

In GitLab by @urbanmatthias on Jun 10, 2020, 14:27

Two more things to consider:

  • In OWL there is the HasKey axiom, which states that two different individuals cannot have the same value for the key. If you have two objects with the same key value, the reasoner would probably infer that these two objects are identical.
  • In the setting of CUDS objects we have always assumed that two CUDS objects with different UUIDs are different. So if we told the reasoner that the UUID is a key and the label is a key, it would probably crash if there are two objects with the same label but different UUIDs.
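The conflict described in these two points can be illustrated without a reasoner. The sketch below is plain Python, not an OWL reasoner and not simphony-osp code; `find_key_conflicts` is a hypothetical helper that treats (oclass, label) as a key and reports every key value shared by more than one UUID.

```python
import uuid
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Key = Tuple[str, str]  # (oclass, label), the assumed key


def find_key_conflicts(
    objects: Iterable[Tuple[uuid.UUID, str, str]],
) -> Dict[Key, List[uuid.UUID]]:
    """Map each (oclass, label) key to its UUIDs if more than one UUID uses it."""
    uids_by_key = defaultdict(set)
    for uid, oclass, label in objects:
        uids_by_key[(oclass, label)].add(uid)
    # Only keys shared by at least two distinct UUIDs are conflicts.
    return {
        key: sorted(uids, key=str)
        for key, uids in uids_by_key.items()
        if len(uids) > 1
    }


# Assumed example data: two machine objects with the same label but different UUIDs.
uid_a, uid_b, uid_c = uuid.uuid4(), uuid.uuid4(), uuid.uuid4()
objects = [
    (uid_a, "TENSILE_TEST_MACHINE", "machine no 3"),
    (uid_b, "TENSILE_TEST_MACHINE", "machine no 3"),
    (uid_c, "TENSILE_TEST", "test 1"),
]
conflicts = find_key_conflicts(objects)
```

`conflicts` then contains a single entry for ("TENSILE_TEST_MACHINE", "machine no 3") listing both UUIDs. Whether such a pair should be merged into one object or reported as an inconsistency is the open question in this thread.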

pablo-de-andres avatar Jun 22 '20 14:06 pablo-de-andres

In GitLab by @urbanmatthias on Jun 10, 2020, 14:28

Maybe it will not crash but tell the user that the data is inconsistent

pablo-de-andres avatar Jun 22 '20 14:06 pablo-de-andres

So the solution here is a reasoner?

aaronAB1993 avatar Jul 30 '20 06:07 aaronAB1993

Could be...

urbanmatthias avatar Aug 03 '20 08:08 urbanmatthias