amplify-backend
CustomResource table create fails on first attempt when backups are enabled
Environment information
System:
OS: Linux 5.15 Ubuntu 24.04.2 LTS 24.04.2 LTS (Noble Numbat)
CPU: (12) arm64 unknown
Memory: 28.50 GB / 31.17 GB
Shell: /bin/bash
Binaries:
Node: 22.15.0 - ~/.local/share/mise/installs/node/22.15.0/bin/node
Yarn: undefined - undefined
npm: 11.3.0 - ~/.local/share/mise/installs/node/22.15.0/bin/npm
pnpm: undefined - undefined
NPM Packages:
@aws-amplify/auth-construct: 1.6.0
@aws-amplify/backend: 1.14.0
@aws-amplify/backend-auth: 1.5.0
@aws-amplify/backend-cli: 1.4.8
@aws-amplify/backend-data: 1.4.0
@aws-amplify/backend-deployer: 1.1.15
@aws-amplify/backend-function: 1.12.1
@aws-amplify/backend-output-schemas: 1.4.0
@aws-amplify/backend-output-storage: 1.1.4
@aws-amplify/backend-secret: 1.1.5
@aws-amplify/backend-storage: 1.2.4
@aws-amplify/cli-core: 1.2.3
@aws-amplify/client-config: 1.5.5
@aws-amplify/deployed-backend-client: 1.5.0
@aws-amplify/form-generator: 1.0.3
@aws-amplify/model-generator: 1.0.12
@aws-amplify/platform-core: 1.6.0
@aws-amplify/plugin-types: 1.8.0
@aws-amplify/sandbox: 1.2.10
@aws-amplify/schema-generator: 1.2.7
aws-amplify: 6.12.2
aws-cdk: 2.177.0
aws-cdk-lib: 2.177.0
typescript: 5.5.4
No AWS environment variables
No CDK environment variables
Describe the bug
We have enabled PITR on our backend tables (table.pointInTimeRecoveryEnabled = true;), however when new tables are added, the resource regularly fails to create with the following error:
Received response status [FAILED] from custom resource. Message returned: Backups are being enabled for the table: Todo-...-NONE. Please retry later (RequestId: ...)
This issue has come up during development, but a redeploy usually succeeds. We're currently trying to make our first deployment from dev to production where we're now creating several tables for the first time. We've re-tried the deployment several times and a different table seems to fail with this error each time. It seems that there's something in the CustomResource that isn't playing nice with PITR.
Any guidance here would be appreciated. At this point we're probably going to have to break the stack down and do one table at a time...
Reproduction steps
Deploy a new Model with PITR enabled.
Hi @cBiscuitSurprise, Thank you for reporting this issue. It appears to be a timing issue between the creation of the DynamoDB table and the enablement of Point-in-Time Recovery (PITR). To help us better understand the root cause, could you please share your backend.ts file?
Sorry got distracted with other stuff. This is blocking us again. Our backend infra is spread across many files (not just backend.ts). I can try to create a repo that replicates our issue. I think it's as simple as: create table, enable backups, deploy. Anytime I add new tables, this blocks the deployment. If I create the table first, deploy, then enable backups, deploy again, it works, but that's a pain.
Here's my util for enabling backups:
export function secureTables(backend: ChBackend) {
  const { amplifyDynamoDbTables } = backend.data.resources.cfnResources;
  for (const table of Object.values(amplifyDynamoDbTables)) {
    table.pointInTimeRecoveryEnabled = true;
  }
}
It also gets in the way anytime we add/remove relationships, since the table gets deleted and recreated (which is a separate nuisance ... it's really annoying to have our data blown away if we add/remove a relationship).
And then sometimes we get stuck in this:
Received response status [FAILED] from custom resource. Message returned: Execution Already Exists: 'arn:aws:states:us-east-2:339713049787:execution:AmplifyTableWaiterStateMachine060600BC-PxonGMRHv92D:11564edb-9428-4258-9584-98240fb749ee'
where we have to go delete the step-function and recreate it... This is really making us regret using Amplify. It's really going to suck the day that we irreversibly destroy our own production app just because of a simple table change.
Stuck on the same issue. Enabling PITR is the last line in our backend.ts. It has not happened in Sandbox.
Facing the same issue.
Current vs Expected Behavior
Current Behavior (PITR Enabled)
// User enables PITR in backend.ts
export function secureTables(backend: ChBackend) {
  const { amplifyDynamoDbTables } = backend.data.resources.cfnResources;
  for (const table of Object.values(amplifyDynamoDbTables)) {
    table.pointInTimeRecoveryEnabled = true;
  }
}

// First deployment attempt
❌ Received response status [FAILED] from custom resource.
   Message returned: Backups are being enabled for the table: Todo-...-NONE.
   Please retry later (RequestId: ...)

// Retry deployment
✅ Deployment succeeds
Expected Behavior
// User enables PITR in backend.ts
export function secureTables(backend: ChBackend) {
  const { amplifyDynamoDbTables } = backend.data.resources.cfnResources;
  for (const table of Object.values(amplifyDynamoDbTables)) {
    table.pointInTimeRecoveryEnabled = true;
  }
}

// First deployment attempt
✅ Deployment succeeds consistently
Hi @cBiscuitSurprise,
Thank you for reporting this issue. This is a known timing issue with Point-in-Time Recovery (PITR) enablement during fresh table deployments. We've identified this as a race condition in the custom resource that handles PITR configuration.
Root Cause:
The custom resource attempts to enable PITR before the DynamoDB table reaches ACTIVE state, causing the "Backups are being enabled" error. This is why retry deployments typically succeed - the table is ready by then.
Immediate Workaround:
// Deploy in two phases to avoid the timing issue:
// 1. Deploy the new tables without PITR first (do not call secureTables yet).
// 2. Once that deployment succeeds, call secureTables(backend) and deploy again.
export function secureTables(backend: ChBackend) {
  const { amplifyDynamoDbTables } = backend.data.resources.cfnResources;
  for (const table of Object.values(amplifyDynamoDbTables)) {
    table.pointInTimeRecoveryEnabled = true;
  }
}
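One way to make the two-phase workaround less error-prone is to gate the PITR step behind an explicit flag, so that switching phases is a configuration change and not a code edit. The following is only a sketch: `PitrCapableTable` is a hypothetical stand-in for the Amplify table construct, and the `enablePitr` parameter (and any environment variable that feeds it) is an assumption, not an Amplify feature.

```typescript
// Hypothetical minimal shape of a table construct; the real objects come from
// backend.data.resources.cfnResources.amplifyDynamoDbTables.
interface PitrCapableTable {
  pointInTimeRecoveryEnabled?: boolean;
}

// Enables PITR on every table only when enablePitr is true.
// Returns the number of tables that were modified.
function secureTables(
  tables: Record<string, PitrCapableTable>,
  enablePitr: boolean,
): number {
  if (!enablePitr) {
    // Phase 1: leave new tables untouched so they can be created first.
    return 0;
  }
  let modified = 0;
  for (const table of Object.values(tables)) {
    // Phase 2: the tables already exist, so PITR can be switched on.
    table.pointInTimeRecoveryEnabled = true;
    modified++;
  }
  return modified;
}
```

In backend.ts this might be called as `secureTables(amplifyDynamoDbTables, process.env.ENABLE_PITR === "true")`, where `ENABLE_PITR` is a made-up variable name: leave it unset for the deployment that creates the tables, then set it for the follow-up deployment.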
This issue affects multiple users and is tracked as a bug. The fix requires:
- Adding proper table waiter logic in the custom resource
- Implementing retry logic with exponential backoff
- Fixing Step Function execution conflicts during retries
Related Issues: #1654
We encourage community contributions to help resolve this issue. The fix would involve modifying the custom resource handler to properly wait for table readiness before enabling PITR.
Priority: This should be treated as a P2 bug since it affects production deployments and requires manual workarounds, though retry deployments typically succeed.
Thank you for your patience, and we appreciate the detailed reproduction steps and user feedback that helps us understand the scope of this issue.