RFC - AWS AppSync Merged APIs

Open ndejaco2 opened this issue 2 years ago • 8 comments

Organizations leverage AWS AppSync in order to create, manage, monitor, and secure serverless GraphQL APIs. With AppSync APIs, teams can access data from multiple different data sources within an AWS account in order to build their GraphQL API.

As AppSync continues to grow in adoption, our customers desire a better way to manage and scale a GraphQL API amongst multiple owning teams and AWS accounts. Teams within an organization want to be able to create, manage, and deploy resources independently in order to separate concerns and increase their development velocity. However, team collaboration on an organization’s AppSync API today has inherent friction due to requiring shared access to IaC repositories and the lack of a way to deploy resources independently. Teams want a way of independently creating/deploying their AppSync resources including the GraphQL schema, resolvers, data sources, and functions while exposing the entire data graph through a single AppSync endpoint.

We’re evaluating implementing the concept of Merged APIs in AppSync which would allow multiple teams to operate independently, contributing to a unified primary GraphQL organizational data interface, and removing bottlenecks from existing manual processes when teams need to share development of a single AppSync API.

[Image: MergedAPIs]

With Merged APIs, organizations can simply import the resources of multiple development team-owned source AppSync APIs into a single AppSync Merged API endpoint that is exposed to clients. A Merged API is created by specifying the Amazon Resource Names (ARNs) of the source AppSync APIs. As part of the creation process, AppSync performs a merge of all of the metadata associated with the source APIs including types, data sources, resolvers, and functions.

For simple use cases, where no definitions in the source APIs conflict, there is no need to modify the source API schemas. The resulting Merged API simply imports all types, resolvers, data sources and functions from the original source AppSync APIs.

For complex use cases, new GraphQL directives can be used to provide the flexibility to resolve conflicts:

  • @canonical: if two or more source APIs have the same GraphQL type or field, one of the APIs can annotate their type or field as canonical, which takes precedence when merging the schemas. Conflicting types without this directive in other source APIs are ignored when merged.
  • @hidden: teams might want to remove or hide specific types or operations in the target API so only internal clients can access specific typed data. With this directive attached, types or fields are not merged to the Merged API target.
  • @renamed: different source APIs may define the same type or field name, yet all of them need to be available in the merged schema. Renaming one of them resolves the naming conflict while keeping every definition in the Merged API target.
  • @key: This directive allows you to specify a custom primary key for an object type to be used in the advanced merging resolver use case. See the “Merging Resolvers” section for more details.

Consider the following example:

Source1.graphql

type Mutation {
    putPost(id: ID!, title: String!): Post
}

type Post {
    id: ID!
    title: String!
}

type Message {
   id: ID!
   content: String
}

type User @canonical {
   id: ID!
   email: String!
   address: String!
}

type Query {
    singlePost(id: ID!): Post
    getMessage(id: ID!): Message
}

Source2.graphql

type Post @hidden  {
    id: ID!
    title: String!
    internalSecret: String!
}

type Message @renamed(to: "ChatMessage") {
   id: ID!
   chatId: ID!
   from: User!
   to: User!
}

# Stub user so that we can link the canonical definition from Source1
type User {
   id: ID!
}

type Query {
    getPost(id: ID!): Post
    getMessage(id: ID!): Message @renamed(to: "getChatMessage")
}

The merged schema would look like the following:

MergedSchema.graphql

type Mutation {
    putPost(id: ID!, title: String!): Post
}

# Post from Source2 was hidden, so only the Source1 definition is used.
type Post {
    id: ID!
    title: String!
}

# Renamed from Message to resolve the conflict
type ChatMessage {
   id: ID!
   chatId: ID!
   from: User!
   to: User!
}

type Message {
   id: ID!
   content: String
}

# Canonical definition from Source1
type User {
   id: ID!
   email: String!
   address: String!
}

type Query {
    singlePost(id: ID!): Post
    getMessage(id: ID!): Message
    
    # Renamed from getMessage
    getChatMessage(id: ID!): ChatMessage
}
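
As a rough illustration of the type-level rules above, the following Python sketch models each source type as a small record and applies @hidden, @renamed, and @canonical in that order. It is a toy model under stated assumptions, not AppSync's merge implementation: `TypeDef` and `merge_types` are hypothetical names, field-level directives are ignored, and references to renamed types inside other fields are not rewritten.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class TypeDef:
    """Toy stand-in for one GraphQL object type in a source schema."""
    name: str
    fields: Dict[str, str]            # field name -> GraphQL type string
    canonical: bool = False           # models @canonical
    hidden: bool = False              # models @hidden
    renamed_to: Optional[str] = None  # models @renamed(to: "...")


def merge_types(sources: List[List[TypeDef]]) -> Dict[str, Dict[str, str]]:
    # Group visible definitions by the name they carry after renaming.
    candidates: Dict[str, List[TypeDef]] = {}
    for source in sources:
        for type_def in source:
            if type_def.hidden:
                continue  # @hidden: never reaches the merged schema
            merged_name = type_def.renamed_to or type_def.name
            candidates.setdefault(merged_name, []).append(type_def)

    merged: Dict[str, Dict[str, str]] = {}
    for name, defs in candidates.items():
        canonical = [d for d in defs if d.canonical]
        if len(canonical) > 1:
            raise ValueError(f"more than 1 canonical definition for type {name}")
        if canonical:
            # @canonical wins; the other definitions of this type are ignored.
            merged[name] = dict(canonical[0].fields)
        else:
            # No directive: take the union of fields across definitions.
            combined: Dict[str, str] = {}
            for d in defs:
                combined.update(d.fields)
            merged[name] = combined
    return merged


# The object types from Source1.graphql and Source2.graphql above.
source1 = [
    TypeDef("Post", {"id": "ID!", "title": "String!"}),
    TypeDef("Message", {"id": "ID!", "content": "String"}),
    TypeDef("User", {"id": "ID!", "email": "String!", "address": "String!"},
            canonical=True),
]
source2 = [
    TypeDef("Post", {"id": "ID!", "title": "String!", "internalSecret": "String!"},
            hidden=True),
    TypeDef("Message", {"id": "ID!", "chatId": "ID!", "from": "User!", "to": "User!"},
            renamed_to="ChatMessage"),
    TypeDef("User", {"id": "ID!"}),
]
merged = merge_types([source1, source2])
```

Here `merged` contains the four types Post, Message, ChatMessage, and User, with Post taken only from Source1 and User taken from the canonical definition.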

When no directives are added to a conflicting type, the merged type will include the union of all fields from all source definitions of that type. For example:

Source1.graphql

type Mutation {
    putPost(id: ID!, title: String!): Post
}

type Post  {
    id: ID!
    title: String!
}

type Query {
    getPost(id: ID!): Post
}

Source2.graphql

type Mutation {
    putReview(id: ID!, postId: ID!, comment: String!): Review
}

type Post  {
    id: ID!
    reviews: [Review]
}

type Review {
   id: ID!
   postId: ID!
   comment: String!
}

type Query {
    getReview(id: ID!): Review
}

The merged GraphQL schema would become the following:

MergedSchema.graphql

type Mutation {
    putReview(id: ID!, postId: ID!, comment: String!): Review
    putPost(id: ID!, title: String!): Post
}

type Post  {
    id: ID!
    title: String!
    reviews: [Review]
}

type Review {
   id: ID!
   postId: ID!
   comment: String!
}

type Query {
    getPost(id: ID!): Post
    getReview(id: ID!): Review
}

Note that the type Post is defined by the union of all the fields defined in the source GraphQL APIs. When the Merged API schema is created, the corresponding resolvers, data sources, and functions are also added to the Merged API.
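
The field union for a single type can be sketched in a few lines of Python. This is only an illustration: `union_fields` is a hypothetical helper, and treating a field that appears with two different GraphQL types as an unresolvable conflict is an assumption made here, since the proposal does not spell out which conflicts cannot be auto-resolved.

```python
from typing import Dict, List, Tuple


def union_fields(definitions: List[Tuple[str, Dict[str, str]]]) -> Dict[str, str]:
    """Union the fields of one type across sources, keeping first-seen order."""
    merged: Dict[str, str] = {}
    first_seen: Dict[str, str] = {}  # field name -> source that defined it first
    for source_name, fields in definitions:
        for field_name, field_type in fields.items():
            if field_name in merged and merged[field_name] != field_type:
                # Assumption: mismatched types for one field cannot be unioned.
                raise ValueError(
                    f"field {field_name!r} is {merged[field_name]} in "
                    f"{first_seen[field_name]} but {field_type} in {source_name}"
                )
            if field_name not in merged:
                merged[field_name] = field_type
                first_seen[field_name] = source_name
    return merged


# Post as defined in Source1.graphql and Source2.graphql above.
post = union_fields([
    ("Source1", {"id": "ID!", "title": "String!"}),
    ("Source2", {"id": "ID!", "reviews": "[Review]"}),
])
```

With the two Post definitions from the example, `post` ends up with the fields id, title, and reviews, matching MergedSchema.graphql.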

In the above example, consider the case where Source1 has configured a unit resolver on Query.getPost which uses a DynamoDB datasource named PostDatasource configured in Source1. This resolver will return the id and title of a Post type. Now, consider Source2 has configured a pipeline resolver on Post.reviews which executes two functions. Function1 has a None datasource attached to do custom authorization checks in VTL. Function2 has a DynamoDB datasource attached to query the ReviewsTable.

query GetPostQuery {
    getPost(id: "1") {
        id
        title
        reviews {
            id
            comment
        }
    }
}

When the above query is executed by a client to the Merged API endpoint, the AppSync service first executes the unit resolver for Query.getPost from Source1 which calls the PostDatasource and returns the data from DynamoDB. Then, it executes the Post.reviews pipeline resolver including Function1 performing custom authorization logic in VTL and Function2 returning the reviews given the id found in $context.source. The service processes the request as a single GraphQL execution and this simple request should only require a single request token towards the MergedAPI default limit of 2000 request tokens per second.
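
To make the order of execution concrete, here is a small Python simulation of that flow. It only mimics the sequence described above (unit resolver first, then the pipeline resolver with the parent object as its source) and is not the AppSync runtime: the in-memory `POSTS` and `REVIEWS` tables, the `canReadReviews` identity flag, and all function names are hypothetical.

```python
# In-memory stand-ins for the DynamoDB tables behind the two source APIs.
POSTS = {"1": {"id": "1", "title": "Hello"}}
REVIEWS = [
    {"id": "r1", "postId": "1", "comment": "Nice"},
    {"id": "r2", "postId": "2", "comment": "Meh"},
]


def get_post(ctx):
    """Unit resolver on Query.getPost (Source1)."""
    return POSTS.get(ctx["arguments"]["id"])


def authorize(ctx):
    """Function1 (None data source): custom authorization check."""
    if not ctx["identity"].get("canReadReviews"):
        raise PermissionError("not authorized to read reviews")
    return None


def query_reviews(ctx):
    """Function2: look up reviews using the id of the parent Post."""
    post_id = ctx["source"]["id"]
    return [r for r in REVIEWS if r["postId"] == post_id]


def run_pipeline(functions, ctx):
    """Run functions in order; the last result is the resolver's result."""
    result = None
    for fn in functions:
        ctx["prev"] = {"result": result}
        result = fn(ctx)
    return result


def execute_get_post(post_id, identity):
    # Step 1: resolve the top-level field.
    post = get_post({"arguments": {"id": post_id}, "identity": identity, "source": None})
    if post is None:
        return None
    # Step 2: resolve Post.reviews with the parent object as the source.
    child_ctx = {"arguments": {}, "identity": identity, "source": post}
    reviews = run_pipeline([authorize, query_reviews], child_ctx)
    return {**post, "reviews": reviews}


result = execute_get_post("1", {"canReadReviews": True})
```

In this sketch both steps happen inside one `execute_get_post` call, mirroring the point that the request is processed as a single GraphQL execution even though the resolvers come from different source APIs.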

Now consider the following case where instead of implementing a field resolver in Source2, we also implement a resolver on Query.getPost in order to provide more than 1 field at a time:

Source1.graphql

type Post  {
    id: ID!
    title: String!
    date: AWSDateTime!
}

type Query {
    getPost(id: ID!): Post
}

Source2.graphql

type Post  {
  id: ID!
  content: String!
  contentHash: String! 
  author: String! 
}

type Query {
    getPost(id: ID!): Post
}

MergedSchema.graphql

type Post  {
  id: ID!
  title: String!
  date: AWSDateTime!
  content: String!
  contentHash: String! 
  author: String! 
}

type Query {
   getPost(id: ID!): Post
}

We are considering two potential ways of handling this conflict. The first option is to disallow conflicting resolvers on the same field. This would force the source APIs to follow a field resolver pattern, which would require Source2.graphql to look like this:

type Post  {
  id: ID!
  postInfo: PostInfo
}

type PostInfo {
   content: String!
   contentHash: String!
   author: String!
}

type Query {
    getPost(id: ID!): Post
}

The advantage of this approach is that it keeps the execution model simple and consistent with how AppSync APIs are executed today. The disadvantage is that it may require source API schemas to be rewritten and clients to change the queries they run to accommodate this restriction.
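
Under this first option, a merge would need to reject any field that has resolvers attached in more than one source API. A minimal sketch of such a check follows; `find_resolver_conflicts` is a hypothetical helper, and representing each source API as a set of (type, field) pairs is an assumption made purely for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

FieldKey = Tuple[str, str]  # (type name, field name)


def find_resolver_conflicts(sources: Dict[str, Set[FieldKey]]) -> Dict[FieldKey, List[str]]:
    """Return fields that have a resolver attached in more than one source API."""
    owners: Dict[FieldKey, List[str]] = defaultdict(list)
    for api_name, resolvers in sources.items():
        for key in resolvers:
            owners[key].append(api_name)
    return {key: sorted(apis) for key, apis in owners.items() if len(apis) > 1}


# Both sources attach a resolver to Query.getPost: rejected under option 1.
before = find_resolver_conflicts({
    "Source1": {("Query", "getPost")},
    "Source2": {("Query", "getPost")},
})

# After the rewrite, Source2 only resolves the nested Post.postInfo field.
after = find_resolver_conflicts({
    "Source1": {("Query", "getPost")},
    "Source2": {("Post", "postInfo")},
})
```

`before` reports Query.getPost as owned by both sources, while `after` is empty, which is the state the field resolver pattern is meant to reach.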

Merging Resolvers

The other option we are evaluating is a concept known as a merging resolver. A merging resolver will allow multiple source resolvers to be attached to the same field and merge the data together to provide a single result. In this example, the AppSync execution would look like this:

  1. Query.getPost resolvers from Source1 and Source2 are executed in parallel.
  2. The child results are merged together using the unique object key (default is a field named “id”).

Source APIs would be able to override the key used in this merge operation with an @key directive, which specifies the selection set of key fields used to merge a type.

type Post @key(selectionSet: "{ id version { version }}") {
  id: ID!
  version: Version!
  title: String!
}

type Version {
   version: Int!
}

The advantage of the merging resolver implementation is that it will give the source API teams more flexibility in defining their source schemas and the resolver execution can be parallelized instead of nested. The disadvantage is that the merged resolver will bring more complexity to the execution model and the ownership of the merging resolver would be shared across the multiple source teams.
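
The two steps of a merging resolver (run the source resolvers in parallel, then merge the results by key) can be sketched as follows. This is an illustrative model only: the resolver functions and sample data are made up, `merging_resolver` is a hypothetical name, and the key is limited to flat top-level fields rather than a nested selection set such as the one in the @key example.

```python
from concurrent.futures import ThreadPoolExecutor


def source1_get_post(post_id):
    """Stand-in for the Query.getPost resolver in Source1."""
    return {"id": post_id, "title": "Hello", "date": "2022-05-25T00:00:00Z"}


def source2_get_post(post_id):
    """Stand-in for the Query.getPost resolver in Source2."""
    return {"id": post_id, "content": "Body text", "contentHash": "abc123", "author": "alice"}


def merging_resolver(resolvers, post_id, key_fields=("id",)):
    if not resolvers:
        return None
    # Step 1: execute every source resolver in parallel.
    with ThreadPoolExecutor(max_workers=len(resolvers)) as pool:
        results = list(pool.map(lambda resolve: resolve(post_id), resolvers))
    results = [r for r in results if r is not None]
    if not results:
        return None
    # Step 2: merge the child results, checking that they share the same key.
    key = tuple(results[0][k] for k in key_fields)
    merged = {}
    for partial in results:
        if tuple(partial[k] for k in key_fields) != key:
            raise ValueError("results refer to different objects; cannot merge")
        merged.update(partial)
    return merged


merged_post = merging_resolver([source1_get_post, source2_get_post], "1")
```

The key check is what the default "id" field (or an @key override) would provide: it guards against merging partial results that describe different objects.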

Source AppSync API Updates

The goal of the Merged API feature is to support team collaboration on a unified AppSync endpoint. Teams can independently evolve their own isolated source GraphQL APIs in the backend and the AppSync service manages the integration of the resources into the single Merged API endpoint for them in order to reduce collaboration friction and increase their velocity.

We plan to support auto-importing changes from the source APIs into the Merged API in order to ensure the Merged API is always up to date. When auto-update is enabled, any update to any of the source APIs will be propagated to the Merged API endpoint. If the update changes a resolver, data source, or function, the imported resource will also be updated. If the update is a schema change, the Merged API will reflect the schema change as long as it does not introduce a new conflict that cannot be auto-resolved.

When a new conflict is introduced that cannot be auto-resolved, both the source API schema update and the Merged API schema update are rejected in order to keep the internal endpoint in sync with the Merged API endpoint used by clients. In that case neither schema is updated, and the CloudFormation deployment fails with failure details describing the merge conflict that could not be handled. A call to GetSchemaCreationStatus for the source API also returns the failure details in the response. An example of this is below:

aws appsync get-schema-creation-status --api-id <SourceApiId>
{
  "status": "FAILED", 
  "details": "Failed to update parent merged api schema <MergedApiId>: more than 1 canonical type definition is defined"
}
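
The same check could be scripted, for example in a deployment pipeline. The sketch below polls GetSchemaCreationStatus until the schema update leaves the PROCESSING state. It assumes a client object exposing `get_schema_creation_status(apiId=...)`, as the boto3 AppSync client does; `_StubClient` is a hypothetical stand-in so the example runs without AWS credentials, and the poll interval and attempt limit are arbitrary choices.

```python
import time


def wait_for_schema(client, api_id, poll_seconds=2.0, max_attempts=30):
    """Poll until the schema update finishes; return (status, details)."""
    for _ in range(max_attempts):
        response = client.get_schema_creation_status(apiId=api_id)
        status = response["status"]
        if status != "PROCESSING":
            return status, response.get("details", "")
        time.sleep(poll_seconds)
    raise TimeoutError(f"schema for {api_id} still processing after {max_attempts} polls")


class _StubClient:
    """Hypothetical stand-in for boto3.client("appsync"), used for illustration."""

    def __init__(self):
        self._responses = [
            {"status": "PROCESSING"},
            {
                "status": "FAILED",
                "details": "Failed to update parent merged api schema <MergedApiId>: "
                           "more than 1 canonical type definition is defined",
            },
        ]

    def get_schema_creation_status(self, apiId):
        # Return the queued responses in order, repeating the last one.
        if len(self._responses) > 1:
            return self._responses.pop(0)
        return self._responses[0]


status, details = wait_for_schema(_StubClient(), "<SourceApiId>", poll_seconds=0)
```

With a real client, a FAILED status whose details mention the parent Merged API indicates that the source schema update was rejected because of a merge conflict.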

We will also provide an API for manually updating the Merged API to “sync” the changes that have occurred in the source APIs since the Merged API was last updated, for teams that prefer to disable auto-updates.

Merged APIs vs Gateway/Router Based Methods of Combining Schemas

There are many solutions and patterns in the GraphQL community for combining GraphQL schemas and enabling team collaboration through a shared graph. One of the more popular paradigms is to use a frontend gateway/router service which handles proxying requests to backend subgraphs. Some examples of this router-based approach include Apollo Federation (https://github.com/apollographql/federation) and Schema Stitching (https://github.com/ardatan/graphql-tools). Merged APIs are a solution that targets the same use case of team collaboration in a single GraphQL endpoint while simplifying the execution model to leverage the AppSync managed GraphQL service for the integration of resources. The following is a comparison of some of the feature differences between Merged APIs and GraphQL gateway-based architectures:

[Image: MergedApisComparison]

One key restriction of Merged APIs is that they currently would only support AppSync source APIs, while gateway based architectures are more flexible. As we continue to build and work backwards from customers to tackle this team collaboration problem, we also recognize the need that some customers have for this flexibility. We have released a reference blog for how to add AppSync APIs as a subgraph using a gateway based architecture with Apollo Federation here: https://aws.amazon.com/blogs/mobile/federation-appsync-subgraph/.

Please comment on this thread if you have some thoughts or suggestions on this feature or if you think we’re missing any story points which you would love to see as a part of this feature!

ndejaco2 avatar May 25 '22 00:05 ndejaco2

Whilst I appreciate that this solves a complex problem, without addressing the description issue (https://github.com/aws/aws-appsync-community/issues/38) combining multiple schemas together is going to cause a lot of confusion.

dale-jepto avatar May 26 '22 00:05 dale-jepto

Thanks for your comment! Certainly makes sense that enhanced support for block string would allow the Merged API schema to be more understandable when collaborating across teams. Full spec upgrade is not currently a feature that we have in scope as part of this proposal but we can certainly investigate what it would take to add this specific support as it has no back-compat concerns. I will bring this up in discussions internally.

ndejaco2 avatar May 26 '22 02:05 ndejaco2

It would be great if the Merged API's and Resolvers could be orchestrated through StepFunctions (both Express & Standard Modes) using pure ASL code. This would allow the process to be automated, and then could spin up databases in DynamoDB (or other AWS) that would be derived directly off the Schemas.

ArtificialChatInc avatar May 30 '22 13:05 ArtificialChatInc

Interested to hear more details about how this would work? With MergedAPIs, the AWS resources should already exist in the source APIs and they are simply imported into the new unified endpoint automatically when the MergedAPI endpoint is created/updated. Is there a use case that this would not support?

ndejaco2 avatar May 31 '22 19:05 ndejaco2

Classic

jamesonwilliams avatar Jun 06 '22 19:06 jamesonwilliams

This would be very helpful. Until then we have to use the Apollo Federation workaround or not use AppSync at all.

claydanford avatar Jun 22 '22 13:06 claydanford

Please plan a way for us to test this. I'd love to have some (npm/cli?) tool that can merge schemas so I can run some unit tests, as this conflict auto-resolution can lead to unexpected results in some cases.

stojanovic avatar Jul 11 '22 07:07 stojanovic

@hidden directive will be useful for our use cases where we want to hide some internal/private fields.

cliren avatar Jul 27 '22 21:07 cliren

There hasn't been any update on this in a while, and it's quite a blocker not being able to federate/merge schemas with AppSync. Can anyone provide any further information about this, please?

MartinDoyleUK avatar Mar 01 '23 14:03 MartinDoyleUK

I see the JavaScript resolvers already being added but not sure about this feature, can someone please confirm?

rodrigomata avatar Apr 17 '23 13:04 rodrigomata

The team is hard at work to launch this feature. We can share that we have decided to not support the @key directive at launch. We hope to be able to share more details soon. If you have any specific requests or feedback, please feel free to share here.

ndejaco2 avatar Apr 17 '23 16:04 ndejaco2

We are happy to announce that Merged APIs has been released today for AWS AppSync.

Launch blog: https://aws.amazon.com/blogs/mobile/introducing-merged-apis-on-aws-appsync/
Docs: https://docs.aws.amazon.com/appsync/latest/devguide/merged-api.html
What's new: https://aws.amazon.com/about-aws/whats-new/2023/05/aws-appsync-merged-apis-graphql-federation/

ndejaco2 avatar May 25 '23 21:05 ndejaco2

Hey! Merged APIs works great!

However, I have a question about subscriptions. I've noticed that subscriptions don't work in the case where a subscriber listens for changes on the Merged API and the change is made directly on a source API.

Is it the general approach and recommendation to perform all operations directly on the Merged API?

PatrykMilewski avatar Jul 03 '23 23:07 PatrykMilewski

Yes, all operations that are expected to trigger subscriptions on the Merged API endpoint should be done through the Merged API.

ndejaco2 avatar Jul 06 '23 19:07 ndejaco2

The @renamed directive is not quite transparent to the source APIs -- $context.info.parentTypeName contains the renamed type when calling the source API -- which may break any resolvers that depend on that information (in our case, we have a single Lambda resolver which uses parentTypeName to choose what code to execute).

Is this the expected behavior, or a missing case that needs to be fixed? My initial expectation would be that the parentTypeName is renamed by the merged API prior to hitting the subgraphs.

You can reproduce this quickly by creating an AppSync API with a "None" data source, a schema of:

type Info {
	parentTypeName: String
}

type Test @renamed(to: "MergedTest") {
	info: Info
}

type Query {
	getTypeName: Test
}

schema {
	query: Query
}

And VTL unit resolvers for Query.getTypeName:

{}

and Test.info:

$util.toJson($context.info)

The response from the underlying AppSync API:

{
  "data": {
    "getTypeName": {
      "info": {
        "parentTypeName": "Test"
      }
    }
  }
}

The response from the merged API:

{
  "data": {
    "getTypeName": {
      "info": {
        "parentTypeName": "MergedTest"
      }
    }
  }
}

rogerchi avatar Jul 12 '23 20:07 rogerchi

Apologies for the late reply on this. Yes, these directives apply to the Merged API schema prior to execution. At execution time, a best practice would be for the resolvers to remain agnostic of the type itself.

ndejaco2 avatar Aug 16 '23 16:08 ndejaco2

@ndejaco2 I have noticed that subscriptions in the source APIs will trigger through the Merged API, with the exception of subscriptions triggered by EventBridge. For those, I can only get updates directly through the source API that triggers them. Is this a bug or the intended behavior?

sbrickner avatar Apr 21 '24 00:04 sbrickner

This is intended behavior. You will need to have your EventBridge send a mutation to the Merged API rather than the source API if you want to trigger updates through EventBridge.

ndejaco2 avatar Apr 21 '24 01:04 ndejaco2

@ndejaco2 perfect, that is what I needed. Thank you

sbrickner avatar Apr 21 '24 01:04 sbrickner