Support Document Retention Policy Configuration for MongoDB via Helm Chart

Open JaroVojtek opened this issue 11 months ago • 5 comments

Description

We recently encountered performance issues with our Testkube deployment, which we traced back to the MongoDB instance used for storing workflow execution results. The root cause was the unbounded growth of the testworkflowresults collection:

> db.testworkflowresults.countDocuments()
// 218361

Following the Testkube MongoDB administration documentation, we manually implemented a TTL index to purge old execution data:

db.testworkflowresults.createIndex(
  { createdAt: 1 },
  { expireAfterSeconds: 604800 } // 7 days
)

After this change, the document count dropped and performance improved significantly:

> db.testworkflowresults.countDocuments()
// 10671

To automate this for future installations, we added an initdbScripts entry that applies this TTL index. This introduces extra complexity and maintenance burden, and we are not sure whether it is the correct, or the only available, way to automate this:

mongodb:
  initdbScripts:
    create-indexes.sh: |
      #!/bin/bash
      until mongosh --eval "print(\"waited for connection\")" > /dev/null 2>&1; do
        sleep 2
      done

      TIMEOUT=300
      START_TIME=$(date +%s)
      TIMEOUT_REACHED=false

      until mongosh --eval '
      db = db.getSiblingDB("testkube");
      if (db.getCollectionNames().length > 0) {
        print("testkube database is ready");
        exit(0);
      } else {
        print("waiting for testkube database...");
        exit(1);
      }
      ' > /dev/null 2>&1; do
        CURRENT_TIME=$(date +%s)
        ELAPSED_TIME=$((CURRENT_TIME - START_TIME))

        if [ $ELAPSED_TIME -ge $TIMEOUT ]; then
          echo "Timeout reached: testkube database not available after 5 minutes"
          TIMEOUT_REACHED=true
          break
        fi

        echo "Waiting for testkube database to be available... (${ELAPSED_TIME}s elapsed)"
        sleep 5
      done

      if [ "$TIMEOUT_REACHED" = true ]; then
        echo "Exiting due to timeout"
        exit 1
      fi

      mongosh --eval '
      db = db.getSiblingDB("testkube");
      db.testworkflowresults.createIndex(
        { createdAt: 1 },
        { expireAfterSeconds: 604800 }
      );
      '
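
Both wait loops in the script above follow the same poll-until-timeout pattern. As a rough sketch only, that pattern could be factored into one helper. The name `wait_for` and its argument order are hypothetical, not part of Testkube or the MongoDB chart:

```shell
#!/bin/bash
# wait_for TIMEOUT_SECONDS INTERVAL_SECONDS COMMAND [ARGS...]
# Hypothetical helper: re-runs COMMAND until it succeeds or the timeout elapses.
# Returns 0 as soon as COMMAND succeeds, 1 if the timeout is reached first.
wait_for() {
  local timeout="$1" interval="$2"
  shift 2
  local start
  start=$(date +%s)
  until "$@" > /dev/null 2>&1; do
    if [ $(( $(date +%s) - start )) -ge "$timeout" ]; then
      echo "Timeout reached after ${timeout}s: $*" >&2
      return 1
    fi
    sleep "$interval"
  done
}

# Usage sketch (assumes mongosh is on PATH, as in the script above):
# wait_for 300 2 mongosh --eval 'print("waited for connection")' || exit 1
```

This only shortens the script. It does not remove the need to maintain a custom init script, which is the main concern here.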

Feature Request:

Instead of requiring manual TTL index creation or custom init scripts, we propose that Testkube should support a retention policy configuration for data stored in MongoDB, defined via Helm chart values. Internally, Testkube could then translate this into the appropriate TTL index creation logic.

testkube:
  dataRetention:
    enabled: true
    duration: 7d  # Automatically remove results older than 7 days

This would allow users to define their retention strategy at the configuration level without needing to understand the underlying MongoDB index mechanics.
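
To illustrate the translation step, here is a minimal sketch of how a `duration` value could be converted into the `expireAfterSeconds` figure used by the TTL index. It assumes a format of a whole number followed by `s`, `m`, `h`, or `d`. The proposal above does not define the format, and `duration_to_seconds` is a hypothetical name rather than an existing Testkube function:

```shell
#!/bin/bash
# Hypothetical helper: convert a retention duration such as "7d" into seconds.
# Assumed format: <whole number><unit>, where unit is s, m, h, or d.
duration_to_seconds() {
  local input="$1"
  local re='^([0-9]+)([smhd])$'
  if [[ ! "$input" =~ $re ]]; then
    echo "unsupported duration: $input" >&2
    return 1
  fi
  # 10# forces base 10 so a value like "08" is not parsed as octal.
  local value=$((10#${BASH_REMATCH[1]}))
  case "${BASH_REMATCH[2]}" in
    s) echo "$value" ;;
    m) echo $((value * 60)) ;;
    h) echo $((value * 3600)) ;;
    d) echo $((value * 86400)) ;;
  esac
}

duration_to_seconds 7d   # prints 604800, the value used in the TTL index above
```

The result would then be passed as `expireAfterSeconds` when the index on `createdAt` is created.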

Thank you for the excellent work on Testkube—we’d love to see this improvement considered!

JaroVojtek avatar May 06 '25 12:05 JaroVojtek

Thank you @JaroVojtek, great suggestion! For @olensmar and @jmorante-ks to prioritise.

vsukhin avatar May 06 '25 14:05 vsukhin

Hello @vsukhin, any update on this?

Thank you

JaroVojtek avatar May 26 '25 11:05 JaroVojtek

Hi @JaroVojtek - thanks for this request; it's in our backlog, making its way through the process. Very sorry to keep you waiting 🙏

olensmar avatar May 26 '25 11:05 olensmar

Hello @olensmar thank you for the update :)

JaroVojtek avatar May 27 '25 07:05 JaroVojtek

Hello @olensmar, any update on this? Thank you

JaroVojtek avatar Jun 26 '25 07:06 JaroVojtek