
fix: add Large Payload Support via Temporary File Transport in call-step-file and node-runner

Open riturajFi opened this issue 2 months ago • 8 comments

Overview

This PR introduces a robust solution to handle large request payloads (JSON or otherwise) without changing existing data types or introducing environment flags. Previously, Motia serialized the entire request body into a single JSON string argument passed via argv to the runner. This caused OS-level E2BIG errors for large payloads (typically > 2–8MB).

The new design uses temporary files to transfer large data between the main process and language runners (Node, Python, Ruby), while preserving backward compatibility for smaller payloads.

Key Changes

🧠 call-step-file.ts

Added detection for payload size before spawning the runner.

When data exceeds a threshold (e.g., >1MB):

Writes the full event object to a secure temporary JSON file (in os.tmpdir()).

Passes only the path of this file to the runner via argv.

On completion or error, the temp directory is safely deleted to prevent disk bloat.

Maintains existing behavior for smaller requests to minimize overhead.
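The steps above could look roughly like the following. This is a minimal sketch, not the code in this PR: `prepareRunnerArg`, `withRunnerArg`, `MAX_INLINE_BYTES`, the `motia-payload-` prefix and the `event.json` file name are all hypothetical names chosen for illustration, and it uses synchronous `fs` calls only to keep the example short.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical threshold; the description only says "e.g., >1MB".
const MAX_INLINE_BYTES = 1024 * 1024;

interface RunnerArg {
  arg: string; // value handed to the runner via argv
  tempDir: string | null; // directory to remove afterwards, if one was created
}

// Decide how to hand the event to the runner: inline JSON or a temp-file path.
function prepareRunnerArg(
  event: Record<string, unknown>,
  maxInlineBytes: number = MAX_INLINE_BYTES,
): RunnerArg {
  const json = JSON.stringify(event);
  if (Buffer.byteLength(json, "utf8") <= maxInlineBytes) {
    return { arg: json, tempDir: null }; // legacy path, unchanged
  }
  // mkdtemp creates a uniquely named directory (mode 0700 on POSIX systems).
  const tempDir = fs.mkdtempSync(path.join(os.tmpdir(), "motia-payload-"));
  const filePath = path.join(tempDir, "event.json");
  fs.writeFileSync(filePath, json, { encoding: "utf8", mode: 0o600 });
  return { arg: filePath, tempDir };
}

// Run a callback with the prepared argument and always remove the temp dir,
// whether the callback returns or throws.
function withRunnerArg<T>(
  event: Record<string, unknown>,
  run: (arg: string) => T,
  maxInlineBytes: number = MAX_INLINE_BYTES,
): T {
  const { arg, tempDir } = prepareRunnerArg(event, maxInlineBytes);
  try {
    return run(arg);
  } finally {
    if (tempDir !== null) {
      fs.rmSync(tempDir, { recursive: true, force: true });
    }
  }
}
```

Measuring the serialized size in bytes rather than characters matters here, because the OS limit applies to the encoded argument, not to the JavaScript string length.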

⚙️ node-runner.ts

Detects whether the received argument is:

A file path → loads JSON from file, executes handler, and deletes the file.

A JSON string → uses legacy flow (unchanged behavior).

Adds cleanup for temporary files even if the process is interrupted.

Maintains identical interface for user handlers — no breaking changes.
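On the runner side, the argument has to be classified before it is parsed. The sketch below shows one way to do that; the detection rule (anything that does not start with `{` or `[` and is an existing absolute path is treated as a file) is an assumption for illustration, and `loadEvent` is a hypothetical helper name, not necessarily what node-runner.ts uses.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Load the event from a runner argument that is either inline JSON
// or the path of a JSON file written by the parent process.
function loadEvent(arg: string): unknown {
  const trimmed = arg.trimStart();
  const looksLikeJson = trimmed.startsWith("{") || trimmed.startsWith("[");

  if (!looksLikeJson && path.isAbsolute(arg) && fs.existsSync(arg)) {
    try {
      return JSON.parse(fs.readFileSync(arg, "utf8"));
    } finally {
      // force: true makes this a no-op if the parent already removed the file.
      fs.rmSync(arg, { force: true });
    }
  }

  // Legacy flow: the argument itself is the JSON document.
  return JSON.parse(arg);
}
```

Because the handler receives the parsed object in both cases, user code does not need to know which transport was used.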

Benefits

✅ Removes OS argument size limitations (supports 100MB+ payloads).

✅ Preserves existing APIs, handlers, and request/response structures.

✅ Reduces memory duplication from repeated JSON serialization.

✅ Ensures secure, auto-cleaned temp directories (mode 0700, file mode 0600).

✅ Fully backward compatible and cross-language ready.

Testing Performed

Verified:

Small payloads (<1MB) → use legacy in-memory mode.

Large JSON payloads (5MB–200MB) → run successfully without E2BIG.

Automatic cleanup of temp files after step completion.

Regression-tested standard steps and event flows.

Next Steps

Add optional streaming support for extremely large binary bodies.

Integrate with multipart/form-data in future versions.

🎥 Demo:

Before -

Screencast from 2025-10-06 10-33-33.webm

After -

Screencast from 2025-10-06 10-39-10.webm

riturajFi avatar Oct 06 '25 06:10 riturajFi

@riturajFi is attempting to deploy a commit to the motia Team on Vercel.

A member of the Team first needs to authorize it.

vercel[bot] avatar Oct 06 '25 06:10 vercel[bot]

Thanks @sergiofilhowz for your comments! I just want to validate whether the approach is correct. I shall implement this for all the runners (plus write an implementation prompt for future runners) and write automated tests.

riturajFi avatar Oct 07 '25 03:10 riturajFi

Hi @sergiofilhowz, I implemented the changes you asked for:

  1. Made the file deletion async.
  2. Made the cleanup step idempotent, to prevent a race condition when two processes delete and access the same file.
  3. Wrote tests for the feature.
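An async, idempotent cleanup of this kind can be sketched as follows. This is an illustration under assumptions rather than the code in the PR: `cleanupTempDir` and `demoCleanup` are hypothetical names, and the sketch relies on the `force: true` option of Node's `fs.promises.rm`, which ignores a path that no longer exists.

```typescript
import { promises as fsp } from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Remove a temp directory without blocking the event loop.
// With force: true, a directory that is already gone is not an error,
// so the parent and the runner can both call this safely.
async function cleanupTempDir(dir: string): Promise<void> {
  await fsp.rm(dir, { recursive: true, force: true });
}

// Usage example: create a temp dir, clean it up twice, and report
// whether it still exists afterwards.
async function demoCleanup(): Promise<boolean> {
  const dir = await fsp.mkdtemp(path.join(os.tmpdir(), "motia-cleanup-"));
  await fsp.writeFile(path.join(dir, "event.json"), "{}");

  await cleanupTempDir(dir); // removes the directory and its file
  await cleanupTempDir(dir); // second call is a no-op

  try {
    await fsp.access(dir);
    return true;
  } catch {
    return false;
  }
}
```

The second call succeeding silently is what makes the step safe when both sides attempt cleanup after the same request.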

riturajFi avatar Oct 14 '25 06:10 riturajFi

This is good! I'm going to merge soon

sergiofilhowz avatar Oct 16 '25 12:10 sergiofilhowz

@riturajFi thanks for the PR, I want to merge it, but first we need to fix the pipeline issues

sergiofilhowz avatar Oct 16 '25 12:10 sergiofilhowz

@sergiofilhowz can you please review this? I fixed some minor concerns that were there. Can we merge it now?

riturajFi avatar Oct 20 '25 10:10 riturajFi

📦 This PR is large (>500 lines). Please ensure it has been properly tested.

github-actions[bot] avatar Oct 23 '25 14:10 github-actions[bot]

Hi @sergiofilhowz, the changes have been implemented according to your suggestion. Could you kindly review this?

riturajFi avatar Nov 05 '25 06:11 riturajFi

Hey, did you and @sergiofilhowz discuss this issue on Discord, or are you still waiting for a review?

rohitg00 avatar Dec 06 '25 16:12 rohitg00