
fix(sample): update gql federation samples to use production-ready

Open mag123c opened this issue 4 months ago • 3 comments

PR Checklist

Please check if your PR fulfills the following requirements:

  • [x] The commit message follows our guidelines: https://github.com/nestjs/nest/blob/master/CONTRIBUTING.md
  • [x] Tests for the changes have been added (for bug fixes / features)
  • [x] Docs have been added / updated (for bug fixes / features)

PR Type

What kind of change does this PR introduce?

  • [x] Bugfix
  • [ ] Feature
  • [ ] Code style update (formatting, local variables)
  • [ ] Refactoring (no functional changes, no api changes)
  • [ ] Build related changes
  • [ ] CI related changes
  • [ ] Other... Please describe:

What is the current behavior?

The GraphQL Federation sample applications (both code-first and schema-first) use IntrospectAndCompose from @apollo/gateway, which is not production-ready. This approach:

  • Performs runtime schema composition at gateway startup
  • Requires all subgraphs to be running when the gateway starts
  • Can cause inconsistencies across multiple gateway instances
  • Is explicitly marked as not suitable for production by Apollo

Issue Number: #14676

What is the new behavior?

The updated samples now demonstrate production-ready patterns:

  1. Static Supergraph Schema: Gateways read from a pre-generated supergraph.graphql file instead of runtime introspection
  2. Schema Generation Scripts: Added npm scripts for generating supergraph schemas:
     • generate:supergraph: Local development using file-based composition
     • generate:supergraph:rover: Production using Apollo Rover CLI
  3. TypeScript Type Generation (schema-first): Added generate:typings script for type safety
  4. Automated Build Process: Pre-start hooks automatically generate required schemas
  5. Comprehensive Documentation: Updated READMEs with clear development vs production workflows
  6. Tests: Added E2E tests to verify production-ready implementation

The samples now provide a clear migration path from development to production, following Apollo Federation best practices.
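
To illustrate the static-schema pattern described above, here is a minimal sketch of how a gateway might load the pre-generated file at startup. The helper names (`assertNonEmptySdl`, `loadSupergraphSdl`) and the error messages are hypothetical and are not taken from the PR. The commented wiring assumes the `supergraphSdl` gateway option with `ApolloGatewayDriver` from `@nestjs/apollo`.

```typescript
import { existsSync, readFileSync } from 'node:fs';

// Hypothetical guard: reject an empty schema early, with a hint on how to regenerate it.
function assertNonEmptySdl(sdl: string, source: string): string {
  if (sdl.trim().length === 0) {
    throw new Error(
      `Supergraph schema from ${source} is empty; run "npm run generate:supergraph".`,
    );
  }
  return sdl;
}

// Hypothetical loader: reads the static supergraph file instead of introspecting subgraphs.
function loadSupergraphSdl(path: string): string {
  if (!existsSync(path)) {
    throw new Error(
      `No supergraph schema at ${path}; run "npm run generate:supergraph" first.`,
    );
  }
  return assertNonEmptySdl(readFileSync(path, 'utf8'), path);
}

// Assumed wiring in the gateway module (not executed here):
//   GraphQLModule.forRoot<ApolloGatewayDriverConfig>({
//     driver: ApolloGatewayDriver,
//     gateway: { supergraphSdl: loadSupergraphSdl('./supergraph.graphql') },
//   });
```

Failing fast with a pointer to the generation script keeps a missing or empty file from surfacing later as a less obvious gateway error.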

Does this PR introduce a breaking change?

  • [ ] Yes
  • [x] No

mag123c, Aug 14 '25 13:08

Coverage collection failed 😨😨

mag123c, Aug 18 '25 08:08

Two questions here:

Is using the Rover CLI mandatory for generating the supergraph, or is it just an alternative to the first generation method? This is not 100% clear from the text IMO.

What would happen if one of the backend applications does not start up? Will the gateway keep crashing, or will it simply return 500 for the methods that require those services? Is the behavior the same even after startup?

dberardo-com, Aug 18 '25 10:08

@dberardo-com

Q1.

No, Rover CLI is NOT mandatory. The implementation provides two methods:

  1. Default method (npm run generate:supergraph)
     • Does NOT require Rover CLI
     • Generates a simplified supergraph for local development
     • Executes generateSupergraphLocal() function
  2. Rover CLI method (npm run generate:supergraph:rover)
     • Recommended for production
     • Composes schema from running subgraphs
     • More accurate supergraph generation

As shown in the code:

// generate-supergraph.ts:108-114
const useRover = process.argv.includes('--rover');
if (useRover) {
  generateSupergraph();  // Uses Rover
} else {
  generateSupergraphLocal();  // Local generation (no Rover needed)
}

The README clearly states these are alternatives, with Rover being the production-recommended approach but not mandatory.
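
For readers unfamiliar with the Rover path, the following is a rough sketch of what composing from running subgraphs could involve. It is not the PR's generateSupergraph() implementation. The helper names, subgraph names, and URLs are placeholders. The sketch assumes Rover's `supergraph compose --config` command and a YAML config with `routing_url` and `schema.subgraph_url` entries per subgraph.

```typescript
// Hypothetical subgraph descriptor; the names and URLs used below are placeholders.
interface SubgraphEndpoint {
  name: string;
  url: string;
}

// Render a Rover supergraph config that introspects each running subgraph.
function renderRoverConfig(subgraphs: SubgraphEndpoint[]): string {
  const lines = ['subgraphs:'];
  for (const { name, url } of subgraphs) {
    lines.push(
      `  ${name}:`,
      `    routing_url: ${url}`,
      '    schema:',
      `      subgraph_url: ${url}`,
    );
  }
  return lines.join('\n') + '\n';
}

// Arguments for `rover supergraph compose`; Rover prints the composed SDL to stdout.
function roverComposeArgs(configPath: string): string[] {
  return ['supergraph', 'compose', '--config', configPath];
}

const roverConfig = renderRoverConfig([
  { name: 'users', url: 'http://localhost:3001/graphql' },
  { name: 'posts', url: 'http://localhost:3002/graphql' },
]);
console.log(roverConfig);
console.log(['rover', ...roverComposeArgs('./supergraph.yaml')].join(' '));
```

Because this route introspects live endpoints, every listed subgraph has to be reachable when the script runs, which is the trade-off against the local file-based method.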

Q2.

Current implementation (static supergraphSdl)

  • At startup: Gateway starts successfully regardless of subgraph availability (it only reads the supergraph.graphql file)
  • At runtime: Queries that require a down service return errors; queries served by the other services continue to work normally
  • Gateway stability: Does NOT crash, remains operational

IntrospectAndCompose behavior (commented out)

  • At startup: Gateway fails to start if subgraphs are unreachable
  • At runtime: With the subgraphHealthCheck option, the gateway can skip a schema update when a subgraph check fails, but it continues polling
  • Less resilient: Service downtime directly impacts gateway availability
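
The runtime behavior with a static supergraph can be pictured with a toy model. This is a simplified illustration only, not how Apollo Gateway plans or executes queries. It assumes each top-level field is served by exactly one subgraph, and the names `resolveFields`, `users`, and `posts` are hypothetical.

```typescript
type SubgraphFetcher = () => Promise<unknown>;

interface GatewayResult {
  data: Record<string, unknown>;
  errors: { field: string; message: string }[];
}

// Toy model: fetch every field independently so one failing subgraph
// cannot take down the fields served by the others.
async function resolveFields(
  plan: Record<string, SubgraphFetcher>,
): Promise<GatewayResult> {
  const fields = Object.keys(plan);
  const settled = await Promise.allSettled(fields.map((f) => plan[f]()));
  const result: GatewayResult = { data: {}, errors: [] };
  settled.forEach((outcome, i) => {
    const field = fields[i];
    if (outcome.status === 'fulfilled') {
      result.data[field] = outcome.value;
    } else {
      // The unavailable subgraph's field becomes null and an error is recorded.
      result.data[field] = null;
      const message = String(outcome.reason?.message ?? outcome.reason);
      result.errors.push({ field, message });
    }
  });
  return result;
}

// Usage: the "posts" subgraph is down, the "users" subgraph is healthy.
resolveFields({
  users: async () => [{ id: '1' }],
  posts: async () => {
    throw new Error('connect ECONNREFUSED');
  },
}).then((r) => console.log(JSON.stringify(r)));
```

In this model the response still carries the `users` data alongside an error entry for `posts`, which mirrors the partial-failure behavior described for the static setup.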

mag123c, Aug 19 '25 00:08