fix: Resolve race condition causing incorrect edge list intermittently on Overview page.

Open Jasonlee6789 opened this issue 1 month ago • 10 comments

Description

Problem

An intermittent issue was observed on the Overview page. When navigating back from an Edge's Live/History page, the Overview sometimes displayed an unexpectedly filtered set of Edges instead of the full Edge list. The bug appeared only occasionally, which made it difficult to reproduce consistently.

Root Cause

The issue was caused by a race condition between the parent component's lifecycle and the child component's event emission. I believe the race arises because ionViewWillEnter() runs too late in the lifecycle, giving the child component a chance to apply stale filters before the parent resets its state.

  1. Parent Component (OverViewComponent): The initialization logic (resetting page, filteredEdges, and calling init()) was placed in ionViewWillEnter(). Since Ionic caches views, this hook triggers every time the view becomes active (including when navigating back), causing the component to unnecessarily reset its state and reload the list.
  2. Child Component <oe-filter (setSearchParams)="searchOnChange($event)">: Simultaneously, the child component restores its previous state and triggers a filtered search via searchOnChange().

Because both flows run independently:

  • Sometimes the parent wins → the unfiltered list is displayed → correct
  • Sometimes the child wins → the stale filter overwrites the parent state → unexpected Edge list

This explains the intermittent nature of the bug and why it is not consistently reproducible.
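The two outcomes above can be reproduced in a small, framework-free simulation. This is only a sketch under assumptions: `Edge`, `ALL_EDGES`, `fetchEdges`, `OverviewState`, `simulate`, and the artificial delays are hypothetical stand-ins and are not taken from the OpenEMS code base; only the method names init() and searchOnChange() mirror the description.

```typescript
// Hypothetical stand-ins; none of these types or values come from OpenEMS.
type Edge = { id: string };

const ALL_EDGES: Edge[] = [{ id: "edge0" }, { id: "edge1" }, { id: "edge2" }];

// Simulated backend call: resolves with the edges matching `query` after `delayMs`.
function fetchEdges(query: string, delayMs: number): Promise<Edge[]> {
  return new Promise((resolve) =>
    setTimeout(
      () => resolve(ALL_EDGES.filter((e) => e.id.includes(query))),
      delayMs,
    ),
  );
}

class OverviewState {
  filteredEdges: Edge[] = [];

  // Parent flow: load the unfiltered list.
  async init(delayMs: number): Promise<void> {
    this.filteredEdges = await fetchEdges("", delayMs);
  }

  // Child flow: re-apply a previously stored (stale) filter.
  async searchOnChange(query: string, delayMs: number): Promise<void> {
    this.filteredEdges = await fetchEdges(query, delayMs);
  }
}

// Starts both flows concurrently; whichever response arrives last
// overwrites filteredEdges and therefore decides what is displayed.
async function simulate(
  parentDelayMs: number,
  childDelayMs: number,
): Promise<number> {
  const state = new OverviewState();
  await Promise.all([
    state.init(parentDelayMs),
    state.searchOnChange("edge1", childDelayMs),
  ]);
  return state.filteredEdges.length;
}
```

Calling `simulate(50, 0)` lets the parent's response arrive last, so the full list remains; `simulate(0, 50)` lets the stale filter arrive last, so only the filtered subset remains. In the real application the delays are not controlled, which is consistent with the bug appearing only intermittently.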

Solution

Move the initial state reset and the metadata subscription from ionViewWillEnter() to the constructor. The constructor runs only once, when the component is instantiated, so the initial state is set deterministically. When navigating back to the Overview, the component simply displays its cached state without re-triggering the initialization logic. This eliminates the race condition, because the parent no longer resets itself in parallel with the child component.
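A minimal sketch of that change, assuming a plain class in place of the real Angular/Ionic component: `OverviewPageSketch`, `initCount`, and the hard-coded edge names are hypothetical, while ionViewWillEnter() is the Ionic lifecycle hook named above. The actual component also subscribes to metadata and issues requests, which is omitted here.

```typescript
// Hypothetical, framework-free model of an Ionic page whose view is cached.
class OverviewPageSketch {
  initCount = 0; // counts how often the initialization logic ran
  page = 0;
  filteredEdges: string[] = [];

  constructor() {
    // Runs once per component instance: reset state and load the list here.
    this.page = 0;
    this.filteredEdges = [];
    this.init();
  }

  // Ionic calls this hook every time the cached view becomes active again.
  ionViewWillEnter(): void {
    // Intentionally empty: no reset, so the cached state stays on screen.
  }

  private init(): void {
    this.initCount++;
    this.filteredEdges = ["edge0", "edge1", "edge2"]; // placeholder data
  }
}

const pageSketch = new OverviewPageSketch();
pageSketch.ionViewWillEnter(); // first time the view is shown
pageSketch.ionViewWillEnter(); // navigating back to the cached view
```

After both hook calls, `pageSketch.initCount` is still 1, so nothing on the parent side competes with the child component's searchOnChange() when the user returns to the Overview.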

Jasonlee6789 avatar Nov 21 '25 08:11 Jasonlee6789

Codecov Report

:white_check_mark: All modified and coverable lines are covered by tests.

Additional details and impacted files
@@              Coverage Diff              @@
##             develop    #3432      +/-   ##
=============================================
- Coverage      59.78%   59.63%   -0.15%     
  Complexity       112      112              
=============================================
  Files           2870     2894      +24     
  Lines         124042   124658     +616     
  Branches        9298     9343      +45     
=============================================
+ Hits           74152    74324     +172     
- Misses         47097    47524     +427     
- Partials        2793     2810      +17     

codecov[bot] avatar Nov 21 '25 08:11 codecov[bot]

  • Reason for modifying overview.component.spec.ts

I moved the initialization logic from ionViewWillEnter to the constructor to prevent a race condition.

This change causes this.router.navigate() to be called immediately when TestBed.createComponent() instantiates the component. Since the original test setup did not mock the Router, this immediate navigation threw the error 'NG04002: Cannot match any routes'. I added a Router spy to the test providers to handle this navigation call gracefully, so the tests pass regardless of when the navigation logic is triggered.
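The idea behind the spy can be shown without Angular or Jasmine. The following is a sketch under assumptions, not the code from the PR: `RouterLike`, `RouterSpy`, `ComponentUnderTest`, and the "/overview" path are hypothetical. In the actual spec file the spy would be registered through the TestBed providers instead of being passed in by hand.

```typescript
// Minimal slice of the Router surface the component is assumed to use.
interface RouterLike {
  navigate(commands: string[]): Promise<boolean>;
}

// Hand-written spy: records calls instead of matching real routes,
// so no "Cannot match any routes" error can be raised.
class RouterSpy implements RouterLike {
  calls: string[][] = [];

  navigate(commands: string[]): Promise<boolean> {
    this.calls.push(commands);
    return Promise.resolve(true);
  }
}

// Hypothetical component that navigates as soon as it is constructed,
// mirroring the effect of moving the logic into the constructor.
class ComponentUnderTest {
  constructor(router: RouterLike) {
    void router.navigate(["/overview"]); // placeholder path
  }
}

const routerSpy = new RouterSpy();
new ComponentUnderTest(routerSpy);
```

Because the spy resolves every navigation, constructing the component is safe in a test, and `routerSpy.calls` can still be inspected to assert that the navigation was requested.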

Jasonlee6789 avatar Nov 28 '25 07:11 Jasonlee6789

When I increase the number of virtual Edges to more than 21, the issue shown in the screen capture occurs: the Overview page behaves abnormally after navigating back to it. My solution resolves this. (Attachment: chrome-capture-2025-12-10)

Jasonlee6789 avatar Dec 10 '25 06:12 Jasonlee6789

I never had this issue either (with 1500 virtual Edges).

Sn0w3y avatar Dec 10 '25 09:12 Sn0w3y

Based on my extensive testing, I found that the issue on the overview page is highly reproducible with the latest public version of the OpenEMS code. For instance, the bug is easy to trigger in my test environment when simulating 22 virtual edges.

Jasonlee6789 avatar Dec 12 '25 07:12 Jasonlee6789

with the latest public version of the OpenEMS code.

I attempted to reproduce this issue following the steps you described, including:

  • Setting persisted filter values in localStorage
  • Testing with network throttling enabled
  • Using the latest develop branch

Unfortunately, I was unable to reproduce the reported behavior on my end.

However, upon reviewing your video, I noticed that not all filter options are being used. This leads me to believe the issue might be related to the filter restoration logic, which could also be connected to #3443.

Could you provide more details about your specific test environment (browser, Angular/Ionic versions) and the exact filter configuration that triggers this behavior?

Sn0w3y avatar Dec 12 '25 07:12 Sn0w3y

Thank you for looking into this. Regarding your questions:

  • Browser Environment: I have tested and observed this issue on both Microsoft Edge and Google Chrome. The Angular and Ionic versions are consistent with the project's dependencies defined in the package.json of the latest code.
  • Filter Configuration: I haven't applied any specific filter settings; the bug occurs using the default configuration.

Jasonlee6789 avatar Dec 12 '25 08:12 Jasonlee6789

Thank you.

Since you mentioned the bug occurs with default configuration (no custom filters applied), could you confirm:

  1. Number of edges: You mentioned 22 virtual edges triggers the issue. Does it occur with fewer edges, or is there a threshold?
  2. Backend response time: Is your test backend running locally or remotely? Slower API responses may increase the likelihood of the race condition manifesting.
  3. Reproducibility rate: Approximately how often does the bug occur when following the reproduction steps? (e.g., 1 in 5 attempts, consistently, etc.)
  4. Visible filters: Do you see all filters, or only the one shown in the video?

Sn0w3y avatar Dec 12 '25 08:12 Sn0w3y

As shown in the screenshot, when I simulate the default number of 10 edges locally, I cannot reproduce this bug. However, when I increase the count to 22, the issue becomes quite frequent: it typically occurs after navigating to different edges and returning to the Overview page just 2 or 3 times. (I can confirm this behavior with the backend running both locally and in our remote production server environment.)

Jasonlee6789 avatar Dec 12 '25 08:12 Jasonlee6789

Ahhhhh, makes sense now !

| Implementation | getPageDevice() | Speed |
| --- | --- | --- |
| MetadataDummy | In-memory HashMap → instant | ~microseconds |
| MetadataOdoo | HTTP calls to Odoo database | ~10-100ms+ |

Why the Bug Only Occurs with MetadataDummy

MetadataDummy (lines 357-359):

    return MetadataUtils.getPageDevice(user, this.edges.values(), paginationOptions);

→ Pure in-memory operation, returns instantly

MetadataOdoo (lines 624-628):

    var result = this.odooHandler.getEdges((MyUser) user, paginationOptions);
    // ... JSON processing

→ Real HTTP/database calls with network latency

The Race Condition Mechanics

With MetadataDummy:

  • Both init() and searchOnChange() fire API requests
  • Both complete in microseconds
  • Results interleave unpredictably within the same JS event loop
  • Race is visible

With Odoo:

  • Network latency (~10-100ms) naturally serializes the requests
  • By the time responses arrive, the Angular lifecycle has stabilized
  • Race is hidden by latency

The bug is real but only manifests with MetadataDummy because instant responses create a timing window for the race. Production Odoo backends have sufficient latency to mask the issue. However, the fix is still architecturally correct and prevents potential future issues (e.g., if Odoo responses become faster through caching).

@da-Kai @sfeilmeier

Sn0w3y avatar Dec 12 '25 08:12 Sn0w3y