Fix failing unit tests

Open barnstee opened this issue 1 year ago • 1 comments

I see a lot of these kind of errors during indexing: "System.InvalidOperationException: The navigation 'ReferenceTypeModel.SuperType' cannot be loaded because one or more of the key or foreign key properties are shadow properties and the entity is not being tracked. Relationships using shadow values can only be loaded for tracked entities."

barnstee avatar Jan 08 '25 07:01 barnstee

Please document how to debug the unit tests locally on a dev box.

barnstee avatar Jan 21 '25 20:01 barnstee

Replicating the Failure

This failure was first observed during execution of the GitHub workflow ".NET" (dotnet.yml).

It can also be replicated on a development workstation that has been configured with the required PostgreSQL database. (The only thing needed is an accessible connection string. All portions of the database are created automatically when the UA-CloudLibraryServer loads into memory.)

From Visual Studio, open the Test Explorer and run the tests. NodeSets are loaded from data files in the TestNamespaces subdirectory. (These are not standard NodeSet files; they are JSON files with the NodeSet data embedded as XML.)

During the process of "indexing" the loaded NodeSets, extraneous "attributes" are trimmed and the nodes are fixed up to connect nodes with subtypes. The elements of the NodeSetModel that are used for this fixup are the "SuperTypes".

The failure occurs reliably in the following situation:

  • A node in one NodeSet (say, http://opcfoundation.org/UA/AML/) has as its subtype a node in another NodeSet (such as http://opcfoundation.org/UA/).
  • The file /UACloudLibraryServer/NodeSetIndex/NodeSetModelIndexer.cs is where the action starts, in the function IndexNodeSetsAsync.

The sequence of events is as follows:

  • The helper library CESMII.OpcUa.NodeSetModel.Factory.Opc is called to assemble the nodes within the nodeset. During the last stages of constructing the nodes, there is a fixup performed on the SuperType.
  • There is a call to a function named "RemoveInheritedAttributes", which resides within the helper library CESMII.OpcUa.NodeSetModel. That function walks the chain of subtypes in reverse, which is where the term "SuperType" comes from.
  • During this process, the Entity Framework Core library is called.
  • An exception is triggered in the function GetLoadValues, which resides in the source file src/EFCore/Internal/EntityFinder.cs (https://github.com/dotnet/efcore/blob/main/src/EFCore/Internal/EntityFinder.cs)

Possible Causes

Working with Entity Framework requires creating a DbContext object (handle) to access data objects. The SuperType link to an outside node seems to have been loaded using a different DbContext from the one used to load the main NodeSet. When this happens, a "detached" object is observed, which is what is causing the error.

Possible Fixes

If the above possible cause (or something related) is, in fact, occurring, then the code needs to be updated to attach the SuperType node correctly to the main NodeSet nodes. The DbContext class has member methods set aside for this express purpose.

PaulLinYao avatar Jul 09 '25 05:07 PaulLinYao

Role of EF Cache

The default behavior is to check the cache first. That makes sense from a performance standpoint, but unfortunately means that inter-namespace node references cause a problem. As such, we need to check for this case, and handle the search for the requested node differently.

PaulLinYao avatar Jul 09 '25 22:07 PaulLinYao

Circular references in AML NodeSet

One of the NodeSet files in my testing data has circular references: http://opcfoundation.org/UA/AML. This file contains 28 nodes, 8 of which error out with the SuperType error message. An example is node (i=5001), InstanceHierarchies. In the references, it has a component in node (i=1005), CAEXFileType.

     <UAObject NodeId="ns=1;i=5001" BrowseName="1:InstanceHierarchies" ParentNodeId="ns=1;i=1005">
            ....
          <Reference ReferenceType="HasComponent" IsForward="false">ns=1;i=1005</Reference>

The CAEXFileType node (i=1005), in turn, claims InstanceHierarchies (i=5001) as a component:

     <UAObjectType NodeId="ns=1;i=1005" BrowseName="1:CAEXFileType">
           .....
          <Reference ReferenceType="HasComponent">ns=1;i=5001</Reference>

Next step: investigate other failing NodeSets to see whether there are any circular references that might also be causing this problem.
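
A minimal sketch of how that scan could be automated, assuming the NodeSet XML has already been extracted from the JSON test files. FindReciprocalComponents is a hypothetical helper, not something in UA-CloudLibrary or the CESMII libraries. It lists node pairs where one node has a forward HasComponent reference and the other has the matching IsForward="false" reference, as in the i=1005 / i=5001 example above. The sample document is a trimmed-down stand-in for the two AML nodes, and the References wrapper element is assumed from the standard UANodeSet layout.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper (not part of UA-CloudLibrary): returns (parent, child) pairs
// where the parent declares a forward HasComponent reference to the child AND
// the child declares the inverse (IsForward="false") reference back to the parent.
static List<(string Parent, string Child)> FindReciprocalComponents(XDocument nodeSet)
{
    var forward = new HashSet<(string, string)>();
    var inverse = new HashSet<(string, string)>();

    foreach (var node in nodeSet.Descendants())
    {
        var nodeId = node.Attribute("NodeId")?.Value;
        if (nodeId == null) continue; // not a node element

        var references = node.Descendants().Where(e =>
            e.Name.LocalName == "Reference" &&
            e.Attribute("ReferenceType")?.Value == "HasComponent");

        foreach (var reference in references)
        {
            var target = reference.Value.Trim();
            // IsForward defaults to true when the attribute is absent.
            var isForward = (bool?)reference.Attribute("IsForward") ?? true;

            if (isForward) forward.Add((nodeId, target));
            else inverse.Add((target, nodeId)); // normalise to (parent, child)
        }
    }

    return forward.Intersect(inverse).OrderBy(p => p.Item1).ToList();
}

// Trimmed-down stand-in for the two AML nodes quoted above.
var sample = XDocument.Parse(@"
<UANodeSet>
  <UAObject NodeId='ns=1;i=5001' BrowseName='1:InstanceHierarchies' ParentNodeId='ns=1;i=1005'>
    <References>
      <Reference ReferenceType='HasComponent' IsForward='false'>ns=1;i=1005</Reference>
    </References>
  </UAObject>
  <UAObjectType NodeId='ns=1;i=1005' BrowseName='1:CAEXFileType'>
    <References>
      <Reference ReferenceType='HasComponent'>ns=1;i=5001</Reference>
    </References>
  </UAObjectType>
</UANodeSet>");

foreach (var (parent, child) in FindReciprocalComponents(sample))
    Console.WriteLine($"{parent} -> {child}"); // prints "ns=1;i=1005 -> ns=1;i=5001"
```

Running the same helper over each failing NodeSet would show whether the pattern appears in the other files as well.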

PaulLinYao avatar Jul 11 '25 04:07 PaulLinYao

Good progress. What I find strange is that the AML package was part of the Cloud Library for a long time and the unit tests only started failing more recently. So perhaps something else changed that all of a sudden made the supertype issue appear? Perhaps a change in the Entity Framework security checks regarding "detached" objects?

barnstee avatar Jul 11 '25 06:07 barnstee

From Copilot:

The error you're encountering in Entity Framework Core:

System.InvalidOperationException: The navigation 'ReferenceTypeModel.SuperType' cannot be loaded because one or more of the key or foreign key properties are shadow properties and the entity is not being tracked. Relationships using shadow values can only be loaded for tracked entities.

means that EF Core is trying to load a navigation property (SuperType) on an entity (ReferenceTypeModel) that:

  • Has a foreign key defined as a shadow property (i.e., not explicitly defined in your class).
  • Is not being tracked by the DbContext, which is required for EF Core to resolve shadow properties.

✅ How to Fix It

Here are a few options depending on your use case:

Option 1: Explicitly Define the Foreign Key

Make sure the foreign key is explicitly declared in your model:

public class ReferenceTypeModel
{
    public int Id { get; set; }

    public int? SuperTypeId { get; set; } // Explicit FK

    public ReferenceTypeModel SuperType { get; set; }
}

And configure it in OnModelCreating if needed:

modelBuilder.Entity<ReferenceTypeModel>()
    .HasOne(r => r.SuperType)
    .WithMany()
    .HasForeignKey(r => r.SuperTypeId);

Option 2: Ensure the Entity Is Tracked

If you're using .AsNoTracking() or projecting into a DTO, EF won't track the entity. To fix this:

  • Remove .AsNoTracking() if you need lazy/eager loading.
  • Or use .Include() to eagerly load the navigation:

var result = context.ReferenceTypeModels
    .Include(r => r.SuperType)
    .FirstOrDefault(r => r.Id == someId);

Option 3: Use Explicit Loading with Tracking

If you need to load the navigation later:

var entity = await context.ReferenceTypeModels.FindAsync(id);
await context.Entry(entity).Reference(e => e.SuperType).LoadAsync();

You said: was this a recent change in EntityFrameworkCore? This used to work fine...

Yes, this behavior in Entity Framework Core (EF Core) has become stricter in recent versions—particularly from EF Core 7 onward and continuing into EF Core 8.

🔍 What Changed?

In earlier versions of EF Core, it was sometimes possible to load navigation properties that relied on shadow foreign keys even when the entity was not being tracked. However, this was never fully reliable and often led to subtle bugs. Starting with EF Core 7 and more explicitly in EF Core 8, the framework enforces that:

Shadow properties (especially foreign keys) require the entity to be tracked in order to resolve navigation properties.

This is now a documented and intentional behavior. If you attempt to load a navigation property that depends on a shadow foreign key while the entity is not tracked (e.g., due to .AsNoTracking() or detached state), EF Core throws the error you're seeing:

The navigation 'X' cannot be loaded because one or more of the key or foreign key properties are shadow properties and the entity is not being tracked.

📌 Why the Change?

This change improves consistency and predictability. Shadow properties are stored in EF Core’s Change Tracker, and if an entity isn’t tracked, EF has no way to resolve or populate those values.

✅ What You Can Do

To adapt to this change:

  • Define explicit foreign key properties in your entity classes.
  • Ensure entities are tracked when loading navigation properties that rely on shadow keys.
  • Use .Include() for eager loading or explicit loading with context.Entry(entity).Reference(...).

📚 Reference

You can find more details in the official EF Core documentation on shadow properties 1, and in community discussions like this Stack Overflow thread 2.

This explains why it started happening after we switched to .NET 9.

barnstee avatar Jul 11 '25 06:07 barnstee

SuperType bug fixed

The bug that was causing the "SuperType" messages (like the one below) has been fixed.

System.InvalidOperationException: The navigation 'ReferenceTypeModel.SuperType' cannot be loaded

Brief recap on bug context

The problem that gave rise to this issue occurred during the indexing of a NodeSet. As mentioned in an earlier comment, failures were reliably observed when the indexing was done between the http://opcfoundation.org/UA/ ("UA") NodeSet and the http://opcfoundation.org/UA/AML/ ("AML") NodeSet.

The indexing / fixup process

The process of indexing a NodeSet involves loading all relevant nodes into memory: the nodes within the current NodeSet, and the nodes from other NodeSets on which the current NodeSet is dependent.

Fixups between nodes in the same NodeSet

The handling of fixups between nodes in the same NodeSet consists of performing one of two actions:

  • Searching for the target node within nodes that have already been loaded (in the EF cache)
  • For nodes that have not been loaded into memory, creating a reminder to be fixed up later after all nodes are loaded. In the world of Entity Framework, this is done using a proxy (see CreateProxy function in class DbContext).

Fixups between nodes in different NodeSets

To enable fixups between nodes in different NodeSets, neither of the two mechanisms was sufficient. What was needed, and what was added to address this issue, was to load the required nodes from the relevant SQL table. The code that enabled this fix is below:

// If we cannot find it in the EF cache, we will query the database
if (nodeModelDb == null)
{
    try
    {
        nodeModelDb = _dbContext.Set<NodeModel>().FirstOrDefault(nm =>
            nm.NodeId == nodeId
            && nm.NodeSet.ModelUri == nodeSet.ModelUri
            && nm.NodeSet.PublicationDate == nodeSet.PublicationDate);
    }
    catch (InvalidOperationException)
    {
        // set to notify that we have not found the item in the database
        nodeModelDb = null;
    }
}

The above code can be found here:

  • Dynamic link library: CESMII.Opc.Ua.NodeSetModel.EF.dll
  • Source file: DbOpcUaContext.cs
  • Function: GetModelForNode<TNodeModel>(string nodeId)

PaulLinYao avatar Jul 18 '25 05:07 PaulLinYao

Passed: 96 Failed: 9 Total: 105

The "SuperType" bug is fixed. There are still 9 tests that are failing, all of which are related to searches being done on the CloudLib database. The set of NodeSets that are returned for the searches do not match the expected set of NodeSets. Here is an example of one such failure:

  Failed CloudLibClient.Tests.CloudLibSearch.Search(keywords: ["Interface", "BaseObjectType"], expectedCount: 28) [1 s]
  Error Message:
   Assert.Equal() Failure: Collections differ
                                                                                                                                                                                       ↓ (pos 7)
Expected: [···, http://opcfoundation.org/UA/IEC61850-6 02/05/2018 00:00:00 2.0 2770297947 INDEXED, http://opcfoundation.org/UA/IEC61850-7-3 02/05/2018 00:00:00 2.0 778956012 INDEXED, http://opcfoundation.org/UA/MachineTool/ 09/25/2020 00:00:00 1.00.0 671163563 INDEXED, http://opcfoundation.org/UA/MachineVision 07/11/2019 10:18:27 1.0.0 277859304 INDEXED, http://opcfoundation.org/UA/MDIS 10/03/2018 00:00:00 1.20 4086316815 INDEXED, ···]
Actual:   [···, http://opcfoundation.org/UA/IEC61850-6 02/05/2018 00:00:00 2.0 2770297947 INDEXED, http://opcfoundation.org/UA/IEC61850-7-3 02/05/2018 00:00:00 2.0 778956012 INDEXED, http://opcfoundation.org/UA/MDIS 10/03/2018 00:00:00 1.20 4086316815 INDEXED, http://opcfoundation.org/UA/MTConnect/v2/ 06/05/2020 00:00:00 2.00.01 2400873877 INDEXED, http://opcfoundation.org/UA/MachineTool/ 09/25/2020 00:00:00 1.00.0 671163563 INDEXED, ···]
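
One thing worth checking, offered as a hypothesis and not a confirmed diagnosis: the visible part of the two lists is consistent with a difference in string ordering. A case-sensitive (ordinal) sort places "MDIS" and "MTConnect" ahead of "MachineTool", while a case-insensitive sort places "MachineTool" and "MachineVision" first. The sketch below only demonstrates the two orderings on four of the URIs from the output above. It does not touch the CloudLib search code or the database collation, and ShortName is a helper introduced here purely for display.

```csharp
using System;
using System.Linq;

// Four of the namespace URIs that appear in the test output above.
var uris = new[]
{
    "http://opcfoundation.org/UA/MachineTool/",
    "http://opcfoundation.org/UA/MachineVision",
    "http://opcfoundation.org/UA/MDIS",
    "http://opcfoundation.org/UA/MTConnect/v2/",
};

// Case-sensitive comparison by character code: 'D' and 'T' sort before 'a'.
var ordinal = uris.OrderBy(u => u, StringComparer.Ordinal).ToArray();

// Case-insensitive comparison: "Machine..." sorts before "MDIS".
var ignoreCase = uris.OrderBy(u => u, StringComparer.OrdinalIgnoreCase).ToArray();

// Ordinal:           MDIS, MTConnect/v2/, MachineTool/, MachineVision
Console.WriteLine("Ordinal:           " + string.Join(", ", ordinal.Select(ShortName)));
// OrdinalIgnoreCase: MachineTool/, MachineVision, MDIS, MTConnect/v2/
Console.WriteLine("OrdinalIgnoreCase: " + string.Join(", ", ignoreCase.Select(ShortName)));

// Display helper: strips the common URI prefix.
static string ShortName(string uri) => uri.Substring("http://opcfoundation.org/UA/".Length);
```

If the ordering does turn out to be the cause, the place to look would be wherever the search results or the expected list are sorted, including the collation used by the PostgreSQL instance.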

PaulLinYao avatar Jul 18 '25 05:07 PaulLinYao

Good progress!

barnstee avatar Jul 18 '25 13:07 barnstee

> Please document how to debug the unit tests locally on a dev box.

A GitHub workflow / action can be run on a dev machine using nektos/act. The installation page for this software is here: https://nektosact.com/installation/index.html

nektos/ACT quick installation guide

(1) Make sure you have WSL2 installed

(2) Install Docker for Windows (create a Docker account if you don't already have one)

(3) Install nektos/act (see link in previous section)

(4) Log into Docker

              docker login --username [email protected] --password XXXX

(5) From the root folder of the repos, run act like this:

             act.exe -W dotnet.yml --defaultbranch pyao_unittests

where:

  • dotnet.yml - is the name of a YML file in the ./.github/workflows folder
  • pyao_unittests - is the name of the branch on which to run the action

nektos/ACT-Specific YML

There are some differences between how a github runner works in the cloud and on the desktop. To minimize the impact of this, create a global environment flag within your YML file, like this:

env:
  # Flag whether running on Github.com or within nektos/act (on Desktop).
  ONGITHUB: true

Within specific jobs that need to use this flag, set a job-specific flag:

    # Set flag that indicates whether we are running on Github or not.
    - name: 10__Github - Set running on Github flag
      if: github.actor == 'nektos/act'
      run: echo "ONGITHUB=false" >> $GITHUB_ENV

Enabling Access to PostgreSQL

For example, here is how a single YML file can support setting up PostgreSQL on both Github and in your local dev system.

On GitHub, this step runs:

    # Run one of these two to enable access to PostgreSQL
    # Postgres Setup step - when running on Github
    - name: 11.1_Start PostgreSQL service on Ubuntu
      if: env.ONGITHUB == 'true'
      run: |
        sudo systemctl start postgresql.service

On a dev system, these steps run:

    # Postgres Setup step - when running in nektos/ACT on Windows
    - name: 11.2_Postgres installation
      if: env.ONGITHUB == 'false'
      run: |
        sudo apt update  
        sudo apt install -y postgresql postgresql-contrib

    # Postgres Setup step - when running in nektos/ACT on Windows
    - name: 11.3_Postgres start service
      if: env.ONGITHUB == 'false'
      run: |
        sudo service postgresql start
        pg_isready

Numbering YML Steps

Adding numbers in front of the names of runner steps makes them easier to debug. A decimal point distinguishes sub-steps that belong to the same logical step (for example, 11.1, 11.2, and 11.3) from steps that are separate (10 and 11).

PaulLinYao avatar Jul 18 '25 21:07 PaulLinYao