feat(luminork): Add Luminork Change Set Review Endpoint

Open stack72 opened this issue 1 month ago • 3 comments

Summary

Adds a new GET endpoint to Luminork that provides a comprehensive, pre-processed review of all changes in a change set. This endpoint aggregates and processes data from multiple materialized views to deliver a clean, ready-to-consume change summary optimized for the Luminork API.

Endpoint

GET /v1/w/{workspace_id}/change-sets/{change_set_id}/review

Response:

{
  "components": [
    {
      "id": "01FXNV4P...",
      "name": "My EC2 Instance",
      "schemaName": "AWS EC2 Instance",
      "diffStatus": "Modified",
      "attributeDiffTrees": [
        {
          "path": "/domain/Region",
          "diff": {
            "old": { "$source": {...}, "$value": "us-east-1" },
            "new": { "$source": {...}, "$value": "us-west-2" }
          }
        }
      ],
      "actionDiffs": []
    }
  ]
}
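A client might consume this response roughly as follows. This is a minimal Python sketch, not an official client: the base URL, the bearer-token header, and the helper names `review_url`, `fetch_review`, and `count_by_status` are assumptions for illustration. Only the path and the response field names come from the description above.

```python
import json
import urllib.request
from collections import Counter


def review_url(base_url: str, workspace_id: str, change_set_id: str) -> str:
    """Build the review endpoint URL from the path given in the description."""
    return (
        f"{base_url.rstrip('/')}/v1/w/{workspace_id}"
        f"/change-sets/{change_set_id}/review"
    )


def fetch_review(base_url: str, workspace_id: str, change_set_id: str, token: str) -> dict:
    """GET the review payload. Bearer-token auth is an assumption."""
    request = urllib.request.Request(
        review_url(base_url, workspace_id, change_set_id),
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def count_by_status(review: dict) -> Counter:
    """Count components per diffStatus in a review payload."""
    return Counter(c["diffStatus"] for c in review.get("components", []))
```

A CLI could print `count_by_status(review)` as a one-line overview before listing the per-component attribute diffs.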

Key Features

🔄 On-Demand Building

  • Not part of incremental MV index - Built only when requested via API
  • Uses SlowRT to avoid blocking async runtime during expensive operations
  • No caching (data changes frequently with every component modification)
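The pattern of keeping expensive work off the async runtime can be illustrated in Python. This is an analogy only, not how SlowRT works internally: it uses `asyncio.to_thread` from the standard library, and `build_review` and `handle_review_request` are hypothetical names standing in for the real build and handler.

```python
import asyncio


def build_review(component_count: int) -> dict:
    """Stand-in for the expensive review build (hypothetical)."""
    components = [
        {"id": f"component-{i}", "diffStatus": "Modified"}
        for i in range(component_count)
    ]
    return {"components": components}


async def handle_review_request(component_count: int) -> dict:
    # Run the expensive build on a worker thread so the event loop stays
    # available for other requests. Nothing is cached: every call rebuilds.
    return await asyncio.to_thread(build_review, component_count)
```

Each request pays the full build cost, which matches the no-caching choice described above.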

🎯 Smart Data Aggregation

Combines data from multiple sources:

  • ComponentList - Basic component info and metadata
  • ComponentDiff - Attribute-by-attribute differences vs HEAD
  • ActionDiffList - Action changes per component
  • ErasedComponents - Components removed from HEAD
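Combining the four sources amounts to a join on component id. The sketch below shows one way that could look. Every input shape here is an assumption made for illustration (the description does not specify them), as is the choice to report erased components with a `Removed` status.

```python
def aggregate_review_inputs(component_list, component_diffs, action_diffs, erased_components):
    """Join per-component data by component id. All input shapes are hypothetical."""
    entries = []
    for component in component_list:
        component_id = component["id"]
        diff = component_diffs.get(component_id, {})
        entries.append({
            "id": component_id,
            "name": component["name"],
            "schemaName": component["schemaName"],
            "diffStatus": diff.get("diffStatus", "None"),
            "attributeDiffTrees": diff.get("attributeDiffs", []),
            "actionDiffs": action_diffs.get(component_id, []),
        })
    # Components erased from HEAD are absent from the component list, so they
    # are appended separately (assumed here to be reported as Removed).
    for erased in erased_components:
        entries.append({
            "id": erased["id"],
            "name": erased["name"],
            "schemaName": erased["schemaName"],
            "diffStatus": "Removed",
            "attributeDiffTrees": [],
            "actionDiffs": [],
        })
    return entries
```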

🧹 Frontend Processing Applied

Replicates frontend logic for clean, useful output:

  • Filters uninteresting diffs:
      • Internal attributes (/si/type, /si/color)
      • Empty schema defaults (empty objects/arrays, null, "")
      • Identical old/new values (can happen on upgrades)
      • Object field placeholders at top levels
  • Corrects diff status:
      • Sets to None if no meaningful changes after filtering
      • Sets to Modified if action diffs exist even without attribute changes
      • Properly handles toDelete + Removed combinations
  • Flattened structure:
      • Returns flat list of { path, diff } pairs
      • Sorted alphabetically for consistent ordering
      • No nested tree complexity
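The filtering and status rules above could be sketched as follows. This is one reading of the bullets, not the actual implementation: the function names are hypothetical, "empty schema defaults" is interpreted as "both sides of the diff are empty", and the object-field placeholder and toDelete + Removed rules are left out because the description does not spell them out.

```python
INTERNAL_PATHS = {"/si/type", "/si/color"}


def value_of(side):
    """Extract "$value" from one side of a diff; a missing side yields None."""
    return None if side is None else side.get("$value")


def is_empty(value) -> bool:
    """True for null, empty string, empty object, or empty array."""
    return value is None or value == "" or value == {} or value == []


def is_interesting(entry: dict) -> bool:
    if entry["path"] in INTERNAL_PATHS:
        return False  # internal attribute
    old = value_of(entry["diff"].get("old"))
    new = value_of(entry["diff"].get("new"))
    if old == new:
        return False  # identical old/new values
    if is_empty(old) and is_empty(new):
        return False  # empty default on both sides (assumed interpretation)
    return True


def review_component(component: dict) -> dict:
    """Filter and sort attribute diffs, then correct the diff status."""
    diffs = sorted(
        (d for d in component["attributeDiffTrees"] if is_interesting(d)),
        key=lambda d: d["path"],
    )
    status = component["diffStatus"]
    has_actions = bool(component["actionDiffs"])
    if status == "Modified" and not diffs and not has_actions:
        status = "None"  # nothing meaningful left after filtering
    elif status == "None" and has_actions:
        status = "Modified"  # action diffs alone count as a modification
    return {**component, "diffStatus": status, "attributeDiffTrees": diffs}
```

Sorting by path after filtering keeps the output stable between calls, which makes successive reviews of the same change set easy to compare.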

📊 Component Ordering

Components sorted by status: Added → Modified → Removed
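This ordering is a sort on a fixed status rank. A minimal sketch, with a hypothetical function name and the assumption that any other status (for example `None`) sorts last:

```python
STATUS_ORDER = {"Added": 0, "Modified": 1, "Removed": 2}


def sort_components(components: list) -> list:
    """Order components Added, then Modified, then Removed.

    Python's sort is stable, so components with the same status keep their
    original relative order. Unknown statuses sort last (an assumption).
    """
    return sorted(
        components,
        key=lambda c: STATUS_ORDER.get(c["diffStatus"], len(STATUS_ORDER)),
    )
```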

Design Decisions

Why not a traditional MV?

  • Review data changes on every component modification
  • Caching would cause stale data
  • On-demand building is more appropriate

Why no ResourceDiff?

  • Would make payload massive for large change sets
  • Clients can fetch per-component via existing endpoints when needed

Why flatten attribute trees?

  • Simpler structure: [{ path, diff }] vs nested trees
  • Easier for clients to consume
  • Full paths already provide hierarchy context
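To make the trade-off concrete, here is a sketch of flattening a nested tree into `{ path, diff }` pairs. The nested node shape (`"diff"` plus a `"children"` mapping) is a hypothetical stand-in, since the description does not show the tree format.

```python
def iter_pairs(node: dict, path: str = ""):
    """Walk a nested diff tree, yielding one pair per node that has a diff.

    Hypothetical node shape: {"diff": {...} or None, "children": {name: node}}.
    """
    if node.get("diff") is not None:
        yield {"path": path or "/", "diff": node["diff"]}
    for name, child in node.get("children", {}).items():
        yield from iter_pairs(child, f"{path}/{name}")


def flatten_tree(root: dict) -> list:
    """Return a flat list of { path, diff } pairs, sorted by path."""
    return sorted(iter_pairs(root), key=lambda pair: pair["path"])
```

Because each pair carries its full path, a client can still group entries by prefix (for example everything under `/domain`) without walking a tree.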

stack72, Nov 24 '25 19:11

Dependency Review

✅ No vulnerabilities or OpenSSF Scorecard issues found.

Scanned Files

None

github-actions[bot], Nov 24 '25 19:11

This endpoint aggregates and processes data from multiple materialized views to deliver a clean, ready-to-consume change summary optimized for the Luminork API.

What's the intended use of the API endpoint? It's not clear to me from this description. If this is intended for MCP use, I think a better pattern would be for there to be a "here are the things with differences" list, and a multi-get API endpoint where agents can retrieve multiple individual diffs at once. As @nickgerace pointed out, there are definitely concerns around the size of the overall diff (both in our ability to store them, but also in whether they'll blow through the token limit of any agent for the return size of a tool call).

If this is limited to a minimal overview, then the size concern should no longer be an issue (it would have to be a truly massive amount of changes in a change set to exceed the tool use token limit for responses). If there is a way to multi-get the details for individual components, then agents can dynamically size their queries, and can group things more intelligently into batches (review SSM param changes, then bucket changes, then..., etc).

Was there a different intent behind the endpoint from what I'm thinking here?

jhelwig, Nov 24 '25 21:11

The diff here is already slimmed down: it doesn't return code diffs, just the components and their AV changes. This isn't just for AI agents, it's also for the CLI. I ran a diff of a change set with 30 component changes and the output was very readable!

stack72, Nov 24 '25 23:11