`RunConfig` as architectural boundary proposal
This change contains a design doc describing an architectural boundary at RunConfig level. Aim of this design doc is
- listing the current responsibilities of
RunConfigand related code (e.g.Runner) - propose a new split of responsibilities and a clear golang API for it
- clarify what responsibilities should be moved in the respective user interfaces, e.g. CLI or Operator
Codecov Report
:white_check_mark: All modified and coverable lines are covered by tests.
:white_check_mark: Project coverage is 53.45%. Comparing base (3453db0) to head (6beef0c).
Additional details and impacted files
@@ Coverage Diff @@
## main #2096 +/- ##
==========================================
+ Coverage 53.43% 53.45% +0.01%
==========================================
Files 230 230
Lines 29385 29385
==========================================
+ Hits 15703 15707 +4
+ Misses 12552 12549 -3
+ Partials 1130 1129 -1
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
:rocket: New features to boost your workflow:
- :snowflake: Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
@claude please review this architectural proposal
Claude finished @JAORMX's task —— View job
Architectural Proposal Review
I've reviewed the RunConfig as architectural boundary proposal. Here's my comprehensive feedback:
Overall Assessment
This is a well-structured and necessary proposal that addresses real architectural concerns in ToolHive. The separation of concerns between RunConfig and the consumer interfaces (CLI, Operator, UI) is architecturally sound and will improve maintainability and extensibility.
Strengths
-
Clear Problem Statement: The proposal effectively articulates the current challenges with configuration handling across different interfaces (CLI, Operator, UI).
-
Practical Examples: The use cases (config reload semantics, storage location differences, export/import) clearly demonstrate why this boundary is needed.
-
Appropriate Scope: Making
RunConfiganalogous toencoding/json(accepting bytes/readers, returning bytes) is the right level of abstraction. -
Current State Documentation: Lines 22-70 provide an excellent inventory of what
RunConfigcurrently contains, split into metadata, workload details, and proxy details.
Areas for Improvement
1. Implementation Guidance Missing
The proposal describes what should change but lacks how to achieve it. Consider adding:
-
Migration Strategy: How do we move from the current state to the proposed state?
- Which fields need to be removed from
RunConfig? - What's the deprecation timeline for file-based parameters?
- How do we handle backward compatibility?
- Which fields need to be removed from
-
API Surface Definition: What should the new public API look like?
// Example: // Instead of exposing RunConfig directly, consider: type RunConfig struct { /* ... */ } func NewRunConfig(opts ...Option) (*RunConfig, error) func (rc *RunConfig) Validate() error func (rc *RunConfig) MarshalJSON() ([]byte, error) func UnmarshalRunConfig(data []byte) (*RunConfig, error)
2. Concrete Examples of Violations
The proposal would benefit from specific examples of current violations:
Current problematic fields in RunConfig (from pkg/runner/config.go:34-162):
EnvFileDir(line 88) - File system referencePermissionProfileNameOrPath(line 79) - File path referenceAuthzConfigPath(line 107) - File path referenceAuditConfigPath(line 113) - File path referenceJWKSAuthTokenFile(line 145) - File path referenceDeployer(line 127) - Runtime-specific dependency
These should be called out explicitly as fields that violate the proposed boundary.
3. Validation Responsibility Unclear
Line 75 mentions "basic validation" as a RunConfig responsibility, but doesn't define what qualifies as "basic":
Suggested clarification:
- Basic validation (RunConfig's responsibility): Structural validity, required fields present, type constraints
- Contextual validation (CLI/Operator responsibility): File existence, network availability, secret references, registry lookups
4. Serialization Format Not Specified
Line 75 mentions serialization but doesn't specify:
- What formats should be supported? (JSON only? YAML? Both?)
- Should the format be versioned? (The struct has
SchemaVersionbut no migration strategy) - What happens when
SchemaVersionchanges?
5. Missing Discussion of Runner Responsibilities
The proposal mentions Runner in the title and problem statement but doesn't clearly define its responsibilities post-refactor. Consider adding:
Proposed Runner Responsibilities:
- Accept a validated
RunConfig - Execute workloads using the provided deployer
- Manage workload lifecycle (start, stop, monitor)
- Handle middleware chain construction
- NOT responsible for: config file reading, secret fetching, registry lookups
6. Impact on Existing Code Not Assessed
The proposal doesn't mention:
- How many call sites need updates?
- What's the blast radius of this change?
- Which packages are affected?
Affected areas (from my analysis):
cmd/thv/app/run.go- CLI configuration buildingcmd/thv-operator/controllers/mcpserver_runconfig.go- Operator CRD mappingpkg/runner/config_builder.go- Builder pattern needs updatingpkg/state/runconfig.go- Storage abstraction
7. "No File-Based Representation" Needs Clarification
Line 77 states: "no file-based representation of configuration parameters is allowed in the RunConfig struct"
Question: Does this mean:
- ❌ No file paths (like
EnvFileDir,AuthzConfigPath)? - ❌ No references to external files at all?
- ✅ File contents should be resolved by CLI/Operator before creating
RunConfig?
This should be explicit with before/after examples:
// ❌ Current (bad)
type RunConfig struct {
AuthzConfigPath string
}
// ✅ Proposed (good)
type RunConfig struct {
AuthzConfig *authz.Config // already resolved
}
8. External API Stability Concern
Line 81 mentions: "Types exposed by RunConfig package should not be used for externally facing formats like HTTP API"
Concern: The current RunConfig struct (pkg/runner/config.go:34-162) has a comment stating it's "part of ToolHive's API contract" (line 36-37). This proposal doesn't address:
- Is
RunConfigJSON format considered a public API? - How do we version it if we need breaking changes?
- What's the compatibility guarantee?
Specific Technical Concerns
1. Storage Abstraction Already Exists
The proposal treats storage as CLI/Operator responsibility, but pkg/state/runconfig.go already provides a storage abstraction layer. How does this fit into the new design?
Recommendation: Clarify whether pkg/state is above or below the boundary.
2. Current Builder Pattern May Need Redesign
pkg/runner/config_builder.go currently handles file resolution during building. With the proposed changes, this logic moves to CLI/Operator, but the builder pattern might still be useful for constructing valid RunConfig instances.
Recommendation: Provide guidance on whether builders should exist and where.
3. Middleware Configuration Complexity
The MiddlewareConfigs field (pkg/runner/config.go:159-161) is complex and touches both proxy configuration and execution environment. The proposal doesn't clarify if/how middleware configuration should be handled.
Missing Sections
Consider adding:
-
Non-Goals: What are we explicitly NOT trying to solve?
- Are we keeping backward compatibility for existing serialized configs?
- Are we refactoring the Operator CRD structure?
-
Success Criteria: How do we know when this is complete?
- All file path references removed from
RunConfig? - CLI and Operator can construct
RunConfigwithout accessing filesystem duringRunner.Run()? - Configuration reload is handled at CLI/Operator level?
- All file path references removed from
-
Testing Strategy: How do we validate the boundary is maintained?
- Linter rules to prevent file paths in
RunConfig? - Interface contracts to enforce separation?
- Linter rules to prevent file paths in
-
Timeline: When do we want this implemented?
Recommendations
Priority 1 (Critical for proposal acceptance):
- Add concrete examples of fields that violate the current boundary
- Define what "basic validation" means
- Clarify the "no file-based representation" rule with examples
- Add migration strategy and backward compatibility approach
Priority 2 (Important for implementation):
- Define the new public API surface
- Clarify
Runnerresponsibilities post-refactor - Specify serialization format and versioning strategy
- Assess impact on existing code
Priority 3 (Nice to have):
- Add non-goals and success criteria
- Include testing strategy
- Define timeline
Conclusion
This proposal identifies a legitimate architectural issue and proposes a sensible solution. However, it needs more implementation detail to be actionable. The core idea of treating RunConfig as a pure data structure (like encoding/json) is sound, but the transition path from current state to desired state needs to be mapped out.
Recommendation: Expand the "Responsibility Split" section significantly before implementation begins, or create a follow-up implementation plan document that addresses the gaps identified above.
References
- Current
RunConfigimplementation:pkg/runner/config.go:34-162 - Current storage abstraction:
pkg/state/runconfig.go - Operator usage:
cmd/thv-operator/controllers/mcpserver_runconfig.go - CLI usage:
cmd/thv/app/run.go
An open question I'd like to discuss tomorrow is: Should we have information about the secret provider that's used and a key to retrieve the secret at runtime? This would make it easier to share configurations.
@JAORMX currently, RunConfig contains "how to identify the secret", but not "where to get it from". Additionally, it is only possible to configure a single secrets provider at a time. We're fine keeping this as is for now.
@blkt Was there anything on this we needed to do to get it ready for review?
Yes, I need to get rid of obsolete fields, which should be either tagged or just absent in the RunConfig struct.