fix: cross namespace hierarchy traversal from cluster-scoped parents to namespaced children (fixes #24379)
Fixes #24379. Fixes #7733 Fixes #8222
This PR adds resource-tree (including UI) support for cluster-scoped parents owning namespaced children. This is a valid kube owner relationship which is sometimes manifested in projects like Crossplane, but in the current version we don't actually capture the children in the resource tree.
Checklist:
- [x] Either (a) I've created an enhancement proposal and discussed it with the community, (b) this is a bug fix, or (c) this does not need to be in the release notes.
- [x] The title of the PR states what changed and the related issues number (used for the release note).
- [x] The title of the PR conforms to the Title of the PR
- [x] I've included "Closes [ISSUE #]" or "Fixes [ISSUE #]" in the description to automatically close the associated issue.
- [x] I've updated both the CLI and UI to expose my feature, or I plan to submit a second PR with them.
- [x] Does this PR require documentation updates?
- [x] I've updated documentation as required by this PR.
- [x] I have signed off all my commits as required by DCO
- [x] I have written unit and/or e2e tests for my change. PRs without these are unlikely to be merged.
- [x] My build is green (troubleshooting builds).
- [x] My new feature complies with the feature status guidelines.
- [x] I have added a brief description of why this PR is necessary and/or what this PR solves.
- [ ] Optional. My organization is added to USERS.md.
- [ ] Optional. For bug fixes, I've indicated what older releases this fix should be cherry-picked into (this may or may not happen depending on risk/complexity).
:x: Preview Environment deleted from Bunnyshell
Available commands (reply to this comment):
- :rocket:
/bns:deployto deploy the environment
Codecov Report
:white_check_mark: All modified and coverable lines are covered by tests.
:warning: Please upload report for BASE (master@a2659e9). Learn more about missing BASE report.
:warning: Report is 1 commits behind head on master.
Additional details and impacted files
@@ Coverage Diff @@
## master #24847 +/- ##
=========================================
Coverage ? 62.56%
=========================================
Files ? 353
Lines ? 50087
Branches ? 0
=========================================
Hits ? 31337
Misses ? 15731
Partials ? 3019
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
Cross-Namespace Hierarchy Traversal
Overview
This branch adds support for traversing resource hierarchies from cluster-scoped parents to their namespaced children
across multiple namespaces. The implementation uses a parent-to-children index (parentUIDToChildren) that maps
resource UIDs to their direct children, enabling O(1) lookups during hierarchy traversal.
Scope: This feature specifically handles traversing from cluster-scoped parents (e.g., Crossplane Provider) to their namespaced children (e.g., Deployment, Service) across multiple namespaces. Regular within-namespace hierarchy traversal continues to use graph-based traversal.
Implementation
Data Structure
// parentUIDToChildren maps a resource's UID to all its direct children
// Maintained for ALL resources that have owner references
// Used during cross-namespace hierarchy traversal for O(1) lookups
parentUIDToChildren map[types.UID][]kube.ResourceKey
Cache Lifecycle Integration
The index is maintained alongside existing cache structures and follows the same lifecycle:
Incremental Updates (Continuous)
During normal operation, Kubernetes watch events trigger incremental cache updates:
-
setNode(): When a resource is added or updated, the index is updated to reflect new owner references -
onNodeRemoved(): When a resource is deleted, it is removed from all parent entries in the index
Full Sync (Rare)
During cluster invalidation (Invalidate() or initial sync):
- All indexes are cleared, including
parentUIDToChildren - Resources are reloaded from Kubernetes
-
rebuildOrphanedChildrenIndex(): Rebuilds the index by scanning all resources and populating parent-child relationships
Normal Refresh (Frequent, ~3 min)
Standard ArgoCD refresh operations do not touch the cache at all - they only re-run health/sync assessments on existing cached data.
Index Maintenance Functions
addToParentUIDToChildren(parentUID, childKey)
- Adds a child to a parent's entry in the index
- Prevents duplicates
- Called during
setNode()andrebuildOrphanedChildrenIndex()
removeFromParentUIDToChildren(parentUID, childKey)
- Removes a child from a parent's entry
- Cleans up empty entries
- Called during
onNodeRemoved()andupdateParentUIDToChildren()
updateParentUIDToChildren(childKey, oldResource, newResource)
- Handles owner reference changes when a resource is updated
- Compares old and new owner refs
- Removes from old parents, adds to new parents
- Called during
setNode()when updating existing resources
Traversal Algorithm
processCrossNamespaceChildren()
For each cluster-scoped parent:
- Get parent's UID
- Look up direct children via
parentUIDToChildren[parentUID](O(1)) - Process each child with the action callback
- Recursively traverse children via
iterateChildrenUsingIndex()
iterateChildrenUsingIndex(parent, nsNodes, visited, action)
Recursive traversal using the index:
- Look up parent's direct children via
parentUIDToChildren[parent.UID] - For each child in the same namespace:
- Skip if already visited
- Execute action callback
- Recursively process child's descendants
Design Rationale
The index-based approach was chosen over graph-based traversal because:
Graph-based approach (considered but not implemented):
- Would require building temporary parent-child graphs for each affected namespace
- O(n × m) complexity where n = affected namespaces, m = resources per namespace
- Example: 10 affected namespaces × 100 resources = 1,000 operations per traversal
- High memory allocation for temporary graph structures
- Scales with percentage of namespaces containing cross-namespace resources (primary dimension)
Index-based approach (current implementation):
- Maintains a persistent index during normal cache operations
- O(k) complexity where k = resources with cross-namespace parents
- Example: 10 affected resources × O(1) lookup = 10 operations per traversal
- Minimal memory overhead from persistent index
- Scales with total count of cross-namespace resources (namespace distribution irrelevant)
Scaling Characteristics Comparison
The index-based approach fundamentally changes what dimensions drive performance:
| Dimension | Graph-Based | Index-Based | Impact |
|---|---|---|---|
| % of namespaces with cross-NS resources | Primary driver (3.5-64x overhead for 4-100%) | Minimal impact | Index eliminates per-namespace graph building |
| Total cross-NS resource count | Secondary (2.31x for 50 vs 10 resources in 1 namespace) | Primary driver (linear scaling) | O(1) lookup per resource |
| Namespace distribution | Critical (1.02x for 1 namespace, 11.5x for 10 namespaces) | Irrelevant | No per-namespace processing |
| Resources per namespace | Important (graph size = namespace size) | Minimal impact | Only traverse relevant resources |
Key insight: The index-based approach makes the feature's performance predictable based solely on the total number of cross-namespace resources in the cluster, regardless of how they're distributed across namespaces. This is a significant improvement over graph-based approaches where namespace distribution is the primary cost driver.
Performance Characteristics
Benchmark Results
Testing with 50 namespaces, 100 resources per namespace, varying percentages of cross-namespace resources:
| Scenario | Time (μs/op) | Memory (B/op) | Allocs/op |
|---|---|---|---|
| 0% cross-namespace | 5.54 | 6,328 | 105 |
| 2% cross-namespace | 5.67 | 6,328 | 105 |
| 4% cross-namespace | 5.67 | 6,328 | 105 |
| 10% cross-namespace | 6.06 | 6,328 | 105 |
| 20% cross-namespace | 7.16 | 7,648 | 108 |
| 100% cross-namespace | 14.46 | 15,264 | 112 |
Performance Scaling
The implementation scales with the total number of cross-namespace resources, not the percentage of namespaces containing them:
| Cross-namespace % | Time (μs) | Overhead vs 0% | Total Cross-NS Resources |
|---|---|---|---|
| 0% | 5.54 | baseline | 0 |
| 2% | 5.67 | +2.3% | 10 (1 namespace) |
| 4% | 5.67 | +2.3% | 20 (2 namespaces) |
| 10% | 6.06 | +9.4% | 50 (5 namespaces) |
| 20% | 7.16 | +29.2% | 100 (10 namespaces) |
| 100% | 14.46 | +161% | 500 (50 namespaces) |
Key difference from graph-based approaches: The index-based implementation scales with the total count of cross-namespace resources, NOT with the percentage of namespaces containing them. This is because:
- Index lookups are O(1) per resource
- No per-namespace graph building
- Cost is proportional to total resources traversed, regardless of namespace distribution
This means performance is largely independent of namespace distribution - whether 10 cross-namespace resources are concentrated in 1 namespace or spread across 10 namespaces makes minimal difference.
Memory Efficiency
Memory usage remains efficient across workload types:
| Cross-namespace % | Memory (B/op) | Growth vs 0% |
|---|---|---|
| 0-10% | 6,328 | 0% |
| 20% | 7,648 | +21% |
| 100% | 15,264 | +141% |
Memory usage grows modestly because the index is maintained persistently - there's no repeated allocation of temporary graph structures during traversal.
Complexity Analysis
Time Complexity: O(k) where k = number of resources with cross-namespace parents
- Each resource is visited once via O(1) index lookup
- No graph construction needed during traversal
Space Complexity: O(r) where r = total resources in cluster
- Index entry for each resource with owner references
- ~64 bytes per parent entry + ~48 bytes per child ResourceKey
- For 5,000 resources with 10% having owner refs: ~32 KB overhead
Real-World Impact
Typical Crossplane Workload (1-2% cross-namespace)
- Time: ~5.7 μs per traversal
- Impact: Minimal overhead vs baseline (2-3%)
- Use case: Crossplane Providers creating namespaced Deployments and Services. Currently, ArgoCD UI shows only the cluster-scoped CRDs generated by Providers; this feature enables displaying the namespaced Deployments and Services as well.
Multi-Provider Setup (5-10% cross-namespace)
- Time: ~6 μs per traversal
- Impact: 9% overhead vs baseline
- Use case: Multiple Crossplane Providers, each generating namespaced resources across several namespaces
Heavy Cross-Namespace Usage (20% cross-namespace)
- Time: ~7 μs per traversal
- Impact: 29% overhead vs baseline
- Use case: Extensive cross-namespace composition patterns
Extreme Cross-Namespace (100% cross-namespace)
- Time: ~14 μs per traversal
- Impact: 161% overhead vs baseline
- Use case: Stress test - all resources have cluster-scoped parents
All overhead figures are still in the microsecond range and represent acceptable performance for the added functionality.
Operational Characteristics
Memory Overhead
The parentUIDToChildren index adds memory overhead proportional to the number of owner references in the cluster:
Memory per entry = ~64 bytes (map entry) + ~48 bytes per child (ResourceKey)
For a typical cluster with 5,000 resources and 10% having owner refs:
500 resources with owners × ~64 bytes = ~32 KB overhead
This is negligible in the context of ArgoCD's overall memory footprint.
CPU Impact
Index maintenance: Near-zero overhead
- Updates happen during existing
setNode()andonNodeRemoved()operations - Simple map operations (add/remove/lookup) are O(1)
- No additional synchronization needed (protected by existing cache lock)
Traversal: Efficient O(k) traversal
- Direct index lookups replace graph construction
- Sublinear scaling with cross-namespace resource percentage
Scalability
The implementation scales well with cluster size:
Namespace scaling: Performance is independent of total namespace count
- Only processes namespaces with relevant resources
- Index lookup is O(1) regardless of cluster size
Resource scaling: Linear with cross-namespace resources
- Each cross-namespace resource adds one index entry
- Traversal time proportional to number of cross-namespace relationships
Testing
Unit Test Coverage
All cross-namespace hierarchy tests pass:
- ✅
TestIterateHierarchyV2_ClusterScopedParent_FindsAllChildren - ✅
TestIterateHierarchyV2_ClusterScopedParentOnly_InferredUID - ✅
TestIterateHierarchyV2_DisabledClusterScopedParents - ✅
TestOrphanedChildrenCleanup - ✅
TestOrphanedChildrenIndex_OwnerRefLifecycle - ✅
TestIterateHierarchyV2_NoDuplicatesCrossNamespace
Benchmark Scenarios
Benchmarks test various real-world patterns:
- Baseline (0%): Pure namespace-scoped resources
- Low (2%): Typical Crossplane operator workload
- Medium (10%): Multi-tenant with RBAC policies
- High (20%): Heavy cross-namespace composition patterns
- Extreme (100%): Stress test with all resources having cluster-scoped parents
- Variation tests: Different numbers of cross-namespace references per parent (10, 25, 50)
- Scale tests: 50 namespaces vs 100 namespaces
Code Location
File: gitops-engine/pkg/cache/cluster.go
Key sections:
- Lines 312-318:
parentUIDToChildrenfield definition - Lines 689-751: Index maintenance helper functions
- Lines 531-535: Index population in
setNode() - Lines 1779-1784: Index cleanup in
onNodeRemoved() - Lines 562-586: Index rebuild in
rebuildOrphanedChildrenIndex() - Lines 1427-1497:
processCrossNamespaceChildren()using index-based traversal - Lines 1464-1497:
iterateChildrenUsingIndex()recursive traversal
Summary
This implementation adds cross-namespace hierarchy traversal to ArgoCD by:
✅ Maintaining a persistent parent-to-children index for O(1) lookups ✅ Achieving efficient O(k) traversal complexity ✅ Keeping memory overhead minimal (~32 KB for typical clusters) ✅ Integrating cleanly with existing cache lifecycle ✅ Scaling sublinearly with cross-namespace resource percentage ✅ Maintaining microsecond-level performance even under stress scenarios
The index-based approach was chosen over graph-based traversal to avoid O(n×m) complexity and repeated memory allocations during traversal operations.
@jcogilvie can you rebase? That should clear some flaky test failures.