azure-container-networking

Fix logging response from NMAgent in syncHostNCVersion function

Open Copilot opened this issue 4 months ago • 0 comments

Problem

The syncHostNCVersion function returned a generic error when some NCs couldn't be updated:

if len(outdatedNCs) > 0 {
    return len(programmedNCs), errors.Errorf("unabled to update some NCs: %v, missing or bad response from NMA", outdatedNCs)
}

This message was not useful because it didn't distinguish between:

  1. NCs that are completely missing from the NMAgent response
  2. NCs that are present in the NMAgent response but programmed to older versions

Solution

Enhanced the error logging to separately track and report missing vs outdated NCs with detailed version information:

  • Missing NCs: Shows NC IDs and their expected versions for NCs completely absent from NMAgent response
  • Outdated NCs: Shows NC IDs with both expected and actual versions for NCs present but outdated in NMAgent response

Changes

Core Implementation (cns/restserver/internalapi.go)

  • Added separate tracking maps during NC processing:
    • missingNCs: Maps NC ID → expected version
    • outdatedNMaNCs: Maps NC ID → "expected:X,actual:Y"
  • Enhanced processing logic to categorize NCs correctly based on NMAgent response
  • Replaced generic error with structured message showing both categories with version details

Test Coverage (cns/restserver/internalapi_test.go)

  • Added comprehensive test TestSyncHostNCVersionErrorMessages covering both scenarios
  • Validates error message content and programmed NC count behavior
  • Ensures existing functionality remains unchanged

Example Output

Before:

"unabled to update some NCs: [nc-id-1 nc-id-2], missing or bad response from NMA"

After:

Missing only: "missing NCs from NMAgent response: map[nc-id-1:2]"
Outdated only: "outdated NCs in NMAgent response: map[nc-id-1:expected:2,actual:1]"
Combined: "unable to update some NCs - missing NCs from NMAgent response: map[nc-id-1:2]; outdated NCs in NMAgent response: map[nc-id-2:expected:3,actual:1]"

This provides operators with actionable information to distinguish between missing NCs (potential NMAgent issues) and outdated NCs (version synchronization issues), along with specific version details for effective troubleshooting.
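One way the three message shapes in the example output could be assembled is sketched below. The function name buildSyncError is hypothetical, and the sketch uses the standard library errors package rather than the errors.Errorf call seen in the original snippet. The message strings are taken from the examples above; the rest is an assumption about how they might be composed.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// buildSyncError is a hypothetical helper that composes the error text
// from the two tracking maps. It returns nil when both maps are empty.
func buildSyncError(missingNCs, outdatedNMaNCs map[string]string) error {
	var parts []string
	if len(missingNCs) > 0 {
		parts = append(parts, fmt.Sprintf("missing NCs from NMAgent response: %v", missingNCs))
	}
	if len(outdatedNMaNCs) > 0 {
		parts = append(parts, fmt.Sprintf("outdated NCs in NMAgent response: %v", outdatedNMaNCs))
	}
	switch len(parts) {
	case 0:
		return nil
	case 1:
		// Single category: report it without the shared prefix.
		return errors.New(parts[0])
	default:
		// Both categories: add the shared prefix and join with "; ".
		return errors.New("unable to update some NCs - " + strings.Join(parts, "; "))
	}
}

func main() {
	missing := map[string]string{"nc-id-1": "2"}
	outdated := map[string]string{"nc-id-2": "expected:3,actual:1"}
	fmt.Println(buildSyncError(missing, nil))
	fmt.Println(buildSyncError(nil, outdated))
	fmt.Println(buildSyncError(missing, outdated))
}
```

Because fmt prints map keys in sorted order, the resulting messages are deterministic, which makes them straightforward to assert on in tests.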

Fixes #3746.



Copilot, Jun 19 '25 17:06