How to do `ON DELETE CASCADE`?
Which Category is your question related to? API, database
I have one @model type that uses several many-to-many connections via a joining model, as described, for instance, here.
Now, when I delete a record that belongs to that @model (e.g., Post in the linked example), I have to "manually" delete all records in the joining model. It is very cumbersome, as there is no way to batch delete them (at least as far as I know).
I wonder if there is a way to make my life easier :-) What would be a recommended way to handle it?
Hi @sakhmedbayev
Currently we don't support cascading delete.
Why not? :-) I think it would be a great feature to add to amplify stack
Hi @sakhmedbayev
I have added this as enhancement, once this is prioritized we will work on this. Feel free to 👍 so this gets more visibility. Also you can open a PR for your use case and discuss with the team.
You can use on-delete subscriptions in the front end and handle the cleanup there.
Cascade delete would be much better implemented with a DynamoDB stream Lambda. If there is no new item in the event, you know the record has been deleted and you can delete the related models. A client-side approach would not be robust.
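The deletion check described above can be sketched as follows. The record shape follows the DynamoDB Streams event format; the item fields are illustrative:

```js
// Sketch of detecting deletions in a DynamoDB stream event. For a REMOVE
// event there is no NewImage; OldImage still carries the deleted item's
// attributes, so related join records can be identified and cleaned up.
function deletedItems(event) {
  return event.Records
    .filter((record) => record.eventName === 'REMOVE')
    .map((record) => record.dynamodb.OldImage);
}

// Example stream payload (shape per the DynamoDB Streams event format):
const streamEvent = {
  Records: [
    { eventName: 'INSERT', dynamodb: { NewImage: { id: { S: 'post-1' } } } },
    { eventName: 'REMOVE', dynamodb: { OldImage: { id: { S: 'post-2' }, __typename: { S: 'Post' } } } },
  ],
};
console.log(deletedItems(streamEvent)); // only the OldImage of post-2
```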
An automated back-end solution is better, yes.
Any modern database has cascade delete, so having one should be a priority.
This would indeed be cool to have. I ended up doing this myself with a Lambda subscribed to a DynamoDB stream, where I semi-replicate the relationships between records (the ones defined in the schema with @connection), so that whenever a record gets deleted, all the connected records which should be removed go with it. The Lambda interacts with DynamoDB directly, which makes it really fast. The only painful part about it is that I have to update that Lambda every time these relationships change in the schema.
@dragosiordachioaia if that's something you could share, I sure would appreciate seeing what that looks like.
I'm doing this client-side at the moment. I just worry that, given enough time, someone will be halfway through cleaning up an n:n relation and lose connectivity. I know that this will then cause non-nullable field errors when I query them via GraphQL.
Like @dragosiordachioaia, I ended up using a Lambda for this. It works pretty well.
Here's how I've recently set it up for us.
We have a `Company` type that has several many-to-many relationships to various entities. For example, each company may have many attorneys. `AttorneyAssignment` is the bridge between `Company` and `Attorney`.
I've created a Lambda called dynamoTrigger. It’s configured to respond to changes in several Dynamo tables, including Company.
In dynamoTrigger/src/index.js:
```js
const AWS = require('aws-sdk');

exports.handler = async (event) => {
  const records = event.Records.map((record) => ({
    // REMOVE events have no NewImage (and INSERTs no OldImage), hence the guards
    new: AWS.DynamoDB.Converter.unmarshall(record.dynamodb.NewImage || {}),
    old: AWS.DynamoDB.Converter.unmarshall(record.dynamodb.OldImage || {}),
  }));
  await cleanupConnections(records);
};
```
Elsewhere, cleanupConnections is defined as:
```js
async function cleanupConnections(records) {
  // A record with an old Company image and no new id was deleted.
  const companyIDs = records
    .filter((record) => record.old.__typename === 'Company')
    .filter((record) => !record.new.id)
    .map((record) => record.old)
    .map((company) => company.id);

  await Promise.all([
    // For each company, find all their remaining connections and delete them.
    ...companyIDs.map(async (companyID) => {
      let companyResponse;
      try {
        companyResponse = await gqlListCompanyConnections(companyID);
      } catch (e) {
        console.log(e);
      }
      if (companyResponse) {
        const attorneys = companyResponse.listAttorneyAssignments.items || [];
        const paralegals = companyResponse.listParalegalAssignments.items || [];
        await Promise.all([
          ...attorneys.map(async (employer) => gqlDeleteItem(
            'AttorneyAssignment', employer.id,
          )),
          ...paralegals.map(async (employer) => gqlDeleteItem(
            'ParalegalAssignment', employer.id,
          )),
        ]);
      }
      return true;
    }),
  ]);
}
```
This relies on a few custom queries.
First, gqlListCompanyConnections finds all the related records for the Company that is being deleted.
```js
const query = /* GraphQL */ `
  query ListCompanyConnections($companyID: ID = "") {
    listAttorneyAssignments(filter: { companyID: { eq: $companyID } }) {
      items {
        id
      }
    }
    listParalegalAssignments(filter: { companyID: { eq: $companyID } }) {
      items {
        id
      }
    }
  }
`;
```
Then I have a helper function to delete an item of any type:
```js
async function gqlDeleteItem(type, id) {
  const query = /* GraphQL */ `
    mutation DeleteItem($id: ID = "") {
      delete${type}(input: { id: $id }) {
        id
      }
    }
  `;
  const variables = {
    id,
  };
  const operationName = 'DeleteItem';
  const rs = await callGraphQL(query, operationName, variables);
  return rs;
}
```
As @dragosiordachioaia notes, it's a pain to stay on top of changing relationships. The really painful part is that there's currently no way to use the CLI to change which models should trigger a Lambda; triggers can only be set when a Lambda is created.* When I've needed to add a new model, I've resorted to the following steps:
- Copy my Lambda's `src` to a temp directory
- Delete my Lambda
- Re-create my Lambda with the CLI, carefully selecting all the old configuration and including my new model as one of the triggers
- Restore the `src` I'd stashed in Step 1

\* In theory one could manually edit the `*-cloudformation-template.json`, but that seems error-prone and likely to be overwritten.
@lseemann do you think it would be advisable to just make a lambda resolver that does this?
Possibly? I confess that resolvers are a part of Amplify I'm not yet adept in, but I'd love to know more about what you have in mind. All I know is that doing it in the client should probably be avoided, for the reason you describe but also because what happens if a record somehow gets deleted outside of the client, such as through the graphQL browser or even directly in Dynamo?
I'm working on this atm and going with the DynamoDB stream trigger as per your example. Quick question: are you also using a trigger to create your AttorneyAssignments upon the "INSERT" event?
Haha, I spent hours trying to use a DynamoDB lambda to clean up the edge connections only to realize that the lambda is triggered after the deletion event and - therefore - I cannot query the edge connections by that ID any more :sweat_smile:
I ended up just using graphql aliases to do this, like so:
```ts
const linkDelete = async ({ id }, authMode?: GRAPHQL_AUTH_MODE) => {
  const { data: { getEdge } } = await API.graphql({
    query: queries.getEdge,
    variables: { id },
    authMode,
  })
  if (!getEdge) {
    console.warn("No Edge found with this id:", id)
    return
  }
  const { nodes: { items } } = getEdge
  if (!items.length) {
    console.warn("No items found for this Edge:", id)
    return
  }
  const [from, to] = items.map(({ id }) => id)
  const mutation = /* GraphQL */ `
    mutation {
      edge: deleteEdge(input: {
        id: "${id}"
      }) { id }
      edgeNodeFrom: deleteEdgeNode(input: {
        id: "${from}"
      }) { id }
      edgeNodeTo: deleteEdgeNode(input: {
        id: "${to}"
      }) { id }
    }
  `
  const results = await CRUD({
    query: mutation,
    variables: {},
    authMode,
  })
  return results
}
```
> Haha, I spent hours trying to use a DynamoDB lambda to clean up the edge connections only to realize that the lambda is triggered after the deletion event and - therefore - I cannot query the edge connections by that ID any more 😅
Oh, man, I should have pointed that out. I think I lost the same hours before I made the same realization.
I think your GraphQL chops are a little beyond mine, but I like your thinking. Thanks for sharing. I'm going to study it a bit more to understand it better.
In my example, since my nodes don't exist any more, I'm using ListCompanyConnections to find the vestigial edges. It looks like you're doing the same thing, but in a less manual fashion?
> I'm working on this atm and going with the DynamoDB stream trigger as per your example. Quick question: are you also using a trigger to create your `AttorneyAssignments` upon the `"INSERT"` event?
No, they're being created manually. A Company and an Attorney are created independently, and then a GraphQL mutation in the client creates an AttorneyAssignment to link them as needed.
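For reference, that client-side link step might look like the following mutation. This is a sketch; the exact input field names depend on the generated schema:

```js
// Hypothetical mutation linking an existing Company and Attorney by creating
// the AttorneyAssignment join record. Field names are assumed, not confirmed.
const createAssignment = /* GraphQL */ `
  mutation CreateAttorneyAssignment($companyID: ID!, $attorneyID: ID!) {
    createAttorneyAssignment(input: {
      companyID: $companyID
      attorneyID: $attorneyID
    }) {
      id
    }
  }
`;
```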
Gotcha, yep. I think it accomplishes the same thing, since the concern was a half-fulfilled mutation; with the aliases, the whole request is sent in a single AppSync API call (and is thus all handled server-side). There may still be issues on the server side, but I'm going to overlook those and hope that AWS doesn't fail me.
btw, if you have an unknown number of connections, you can extend the example above by concatenating template strings with an alias incremented by the index. E.g., instead of `edgeNodeFrom:`, use something like `edgeNode${index}:`.
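That indexed-alias idea might look something like this sketch (names follow the Edge/EdgeNode example above):

```js
// Build one mutation that deletes an edge plus any number of edge nodes,
// giving each delete a unique index-based alias so they can share a request.
function buildDeleteMutation(edgeId, nodeIds) {
  const nodeDeletes = nodeIds
    .map((nodeId, index) =>
      `  edgeNode${index}: deleteEdgeNode(input: { id: "${nodeId}" }) { id }`)
    .join('\n');
  return /* GraphQL */ `mutation {
  edge: deleteEdge(input: { id: "${edgeId}" }) { id }
${nodeDeletes}
}`;
}

const mutation = buildDeleteMutation('edge-1', ['node-a', 'node-b', 'node-c']);
// mutation contains the aliases edgeNode0, edgeNode1, and edgeNode2
```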
I needed cascade deletion in a project of mine and initially set out with the DynamoDB Lambda trigger approach. But I had some concerns which ultimately led to my choosing a different method:
- The Lambda trigger is invoked on all update events to the table, not just on `REMOVE` events. That's a lot of extra, unnecessary invocations.
- Scoping the GraphQL requests in the Lambda trigger to respect `@auth` model directives without direct access to the Cognito user calling the `delete` mutation seemed super difficult. Just using an all-access Lambda IAM execution role is very heavy-handed in my opinion and could lead to security issues with more complicated relationships / auth rules.
- Most importantly for me: I'm using `rtk-query` to consolidate and cache Amplify API requests. This requires being able to correctly invalidate cached requests for affected entries. The Lambda DynamoDB trigger approach doesn't allow returning the cascade-deleted entries back to the client; in fact, it doesn't allow returning anything back to the client. So to invalidate the cache I'd have to predict the cascade-deleted entries with a deeply nested query beforehand. No good.
So instead of using a Lambda trigger based on DynamoDB updates, I decided to run the cascade deletion from a serverless express Lambda function behind an endpoint in a REST API, with authorization based on the Cognito user pool configured for my AWS Amplify app. By accessing the Authorization header in the request that invokes the Lambda function, I have the user's JWT ID token and can pass it through to the GraphQL requests in the Lambda, assuming that user's identity when running queries and mutations.
This solves pretty much all of the problems I had above:
- We now only begin the cascade deletion when a user explicitly makes a request to the REST API, so we have very clear and limited invocation conditions for the Lambda function.
- As the GraphQL requests are signed as if they were made by the authenticated Cognito user who invoked the Lambda function, we don't have to worry about complex IAM roles on the Lambda function.
- The response takes a bit longer than a normal `delete` on the GraphQL API, but we now prevent any race conditions that could re-query incomplete data in the middle of the cascade deletion. Furthermore, the Lambda function returns arrays of the IDs of all affected entries, so invalidating the cache is trivial.
Implementation of the above is fairly straightforward and within the bounds of documented AWS Amplify, except for configuring the REST API's Cognito authorizer, which requires some CDK overriding. I wrote a couple of posts outlining the process below:
- [Part 1] - Building an identity-assuming GraphQL client in a Lambda layer
- [Part 2] - Building the cascade deletion serverless express Lambda function
- [Part 3] - Building a REST API for the Lambda functions and accessing the endpoints from the client
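The token pass-through described above can be sketched as follows. This is a hedged outline: the header names are standard HTTP, but everything else is illustrative:

```js
// Sketch: lift the caller's Cognito JWT from the incoming REST request and
// reuse it as the Authorization header on GraphQL calls, so AppSync applies
// that user's @auth rules. The token value below is a placeholder.
function graphqlHeadersFrom(requestHeaders) {
  const token = requestHeaders.authorization || requestHeaders.Authorization;
  if (!token) throw new Error('Missing Authorization header');
  return {
    'Content-Type': 'application/json',
    Authorization: token, // validated by AppSync against the user pool
  };
}

const headers = graphqlHeadersFrom({ authorization: 'placeholder-jwt' });
// headers.Authorization now carries the caller's token
```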
+1. Related to https://github.com/aws-amplify/amplify-category-api/issues/623
Big +1 here still; looking for this with Amplify Gen 2. I would love it if you could flag something as required, like below, so that the backend automatically deletes the entire object if that value becomes invalid.
Alternatively, you could create a new tag called autoDeleteParentUponInvalidation.
```ts
Friendship: a
  .model({
    id: a.id().required(),
    receiverId: a.id().required(), // option #1: making it required forces the object to delete when it is invalid
    receiver: a.belongsTo("User", "receiverId"),
    senderId: a.id().required().autoDeleteParentUponInvalidation(), // option #2: tell the backend to do this directly
    sender: a.belongsTo("User", "senderId"),
    status: a.ref("FriendStatus").required(),
    owners: a.string().array(),
  })
```
This would literally save me hundreds of lines of code in my current project, with how many queries and deletes I have to run prior to deleting a User object.
Is there any solution for this in the GraphQL API?