deno icon indicating copy to clipboard operation
deno copied to clipboard

Node compatibility issue with handling multipart/form-data files

Open firecakes opened this issue 10 months ago • 8 comments

I've been noticing file corruption issues while letting the koa-body library handle multiple files with multipart/form-data requests. In the Deno server code where I'm using the npm modules some files get corrupted during the save while with NodeJS the files are saved fine. I'm using Deno v1.36.3, and the issue can reliably happen as long as you're selecting 20+ images on the browser as a multi-file upload to the server.

Client code that sends the request:

const formData = new FormData();
for (let i = 0; i < event.target.files.length; i++) {
  const file = event.target.files[i];
  formData.append('myFile', file);
}

const results = await axios.post('/api/file', formData, {
  headers: {
    'Content-Type': 'multipart/form-data'
  }
});

Server code in Deno:

import Koa from "npm:[email protected]";
import Router from "npm:@koa/[email protected]";
import serve from "npm:[email protected]";
import { koaBody } from "npm:[email protected]";
import { createReadStream } from "node:fs";
import http from "node:http";


// start the web server initialization
const app = new Koa();
const router = new Router();

app.use(koaBody({
  parsedMethods: ["POST", "PUT", "PATCH", "DELETE"], // add DELETE as parsed method
  multipart: true, // parse multipart form data
  formidable: { // modify where the form data gets saved
    uploadDir: "static/tmp",
    keepExtensions: true,
    maxFileSize: 10 * 1024 * 1024 * 1024,
  },
}));

// add a new file
router.post("/api/file", async (ctx, next) => {
  if (!Array.isArray(ctx.request.files.myFile)) {
    ctx.request.files.myFile = [ctx.request.files.myFile];
  }

  // rename the files to what they were originally and move them to the correct location
  for await (let file of ctx.request.files.myFile) {
    await Deno.rename(
      `static/tmp/${file.newFilename}`,
      `static/files/${file.originalFilename}`,
    );
  }


  ctx.body = {
    files: ctx.request.files.myFile.map((file) =>
      `files/${file.originalFilename}`
    ),
  };
});

// get files under static folder, and resolve "/" to index.html
app.use(serve("static"));

app.use(router.routes());
app.use(router.allowedMethods());

http.createServer(app.callback()).listen(8000);

Server code in NodeJS

import Koa from "koa";
import Router from "@koa/router";
import serve from "koa-static";
import { koaBody } from "koa-body";
import { createReadStream } from "fs";
import http from "http";
import { readdir, rename } from 'node:fs/promises';

// start the web server initialization
const app = new Koa();
const router = new Router();

app.use(koaBody({
  parsedMethods: ["POST", "PUT", "PATCH", "DELETE"], // add DELETE as parsed method
  multipart: true, // parse multipart form data
  formidable: { // modify where the form data gets saved
    uploadDir: "static/tmp",
    keepExtensions: true,
    maxFileSize: 10 * 1024 * 1024 * 1024,
  },
}));

// add a new file
router.post("/api/file", async (ctx, next) => {
  if (!Array.isArray(ctx.request.files.myFile)) {
    ctx.request.files.myFile = [ctx.request.files.myFile];
  }

  // rename the files to what they were originally and move them to the correct location
  for await (let file of ctx.request.files.myFile) {
    await rename(
      `static/tmp/${file.newFilename}`,
      `static/files/${file.originalFilename}`,
    );
  }


  ctx.body = {
    files: ctx.request.files.myFile.map((file) =>
      `files/${file.originalFilename}`
    ),
  };
});

// get files under static folder, and resolve "/" to index.html
app.use(serve("static"));

app.use(router.routes());
app.use(router.allowedMethods());

http.createServer(app.callback()).listen(8000);

firecakes avatar Aug 25 '23 15:08 firecakes

I can be more specific. I was testing uploading PNG files and then trying to read them back in the browser, and this is how I was noticing the data corruption issues, where the data in some of the files gets jumbled up and is unrecognizable as an image file.

The chance of each file being corrupted also appears to be correlated to its file size. For example, a bunch (15+) of images at 305B each had no corruptions. A bunch of 3.5KB images had only one corruption, and a bunch of large 2.1MB images had every image except for one corrupted.

firecakes avatar Aug 30 '23 22:08 firecakes

Updating file size to 5MB did the trick

Reproduction (Windows 10):

Node v20.5.1

npm ci node .\main.js

Note all hashes match


Deno v1.36.4 (release, x86_64-pc-windows-msvc)

deno run --allow-all ./main.js

Note second hash mismatch


repro.zip

Removing fs.writev helps induce the bug after only 2 iterations, (See main.js@50

This is my analysis so far

  1. In node_modules\.deno\[email protected]\node_modules\formidable\src\VolatileFile.js, put a conditional breakpoint on the this._writeStream.write@line 64 => this.newFilename.endsWith("1.bin") (This will cause breakpoint when second file starts to write)
  2. Start debugger, wait until previous breakpoint is hit
  3. Step down through the code until end up at _stream.mjs@3930 (state.ended || state.reading || state.destroyed || state.errored || !state.constructed)
  4. Note that it will write to state.buffered, because !state.constructed is false
  5. Run until breakpoint, now we are at second write iteration of the second file
  6. Note that it will write to state.buffered, because !state.constructed is false
  7. Note that buffer still has the previous data from iteration 1 in it also
  8. Run until breakpoint, now we are at third write iteration of the second file
  9. Note now the state.constructed is true, we will bypass the buffer
  10. Note that buffer still has the previous data from iteration 1 & 2 in it (An easier way to check this is conditional breakpoint on _stream.mjs@3944 => break when state.buffered.length > 0
  11. Data corrupted, buffer was not cleared before this third write (I checked in hex editor, bytes of the first write are from the third iteration)

image

cc @bartlomieju This is a reproducible data corruption issue

CoenraadS avatar Aug 31 '23 06:08 CoenraadS

Thanks for the report folks. We fixed a big bug in Socket class in v1.36.4, is it still reproducible in the latest version?

bartlomieju avatar Sep 02 '23 18:09 bartlomieju

@bartlomieju Yes, also occurs with deno 1.36.4 (release, x86_64-pc-windows-msvc)

CoenraadS avatar Sep 02 '23 18:09 CoenraadS

Thanks, we'll look into that!

bartlomieju avatar Sep 02 '23 18:09 bartlomieju

Any updates on this @bartlomieju? Also tried this with v1.37.2.

firecakes avatar Oct 31 '23 19:10 firecakes

Update: I haven't had issues with this since running newer versions of Deno (v1.38.3+). Seems like it got resolved, just wanted to make sure it was intentional before closing the issue.

firecakes avatar Jan 27 '24 22:01 firecakes

I tested my repro from https://github.com/denoland/deno/issues/20284#issuecomment-1700418050 with 1.40.2 and it still occurs.

However the 'happy path' when fs.writev is available seems to work ok.

.e.g.

Ok:
const fileStream = fs.createWriteStream(vf.filepath);

Potential corruption:
const fileStream = fs.createWriteStream(vf.filepath, { fs: { open: fs.open, write: fs.write, close: fs.close } }); // Removed fs.writev

CoenraadS avatar Jan 27 '24 22:01 CoenraadS

@CoenraadS I couldn't reproduce the data corruption using the repro.zip (main.js) with the latest Deno on Windows Server 2022 and MacOS 13.3. Can you still reproduce the error with the latest Deno?

Screenshot 2024-03-27 at 22 36 48

If you still can, how likely does it happen on your end?

kt3k avatar Mar 27 '24 13:03 kt3k

A createWriteStream issue has just been fixed in node.js, but it still exists in deno. It might be related to this issue.

Repro:

import fs from 'node:fs';
import process from 'node:process';

const w = fs.createWriteStream('file.txt');

w.on('open', () => {
  w.write('hello');

  process.nextTick(() => {
    w.write('world');
    w.end();
  });
});

w.on('close', () => {
  console.log(fs.readFileSync('file.txt', 'utf8'));
});

deno v1.42.0:

$ deno run --allow-all index.mjs

worldhello

dy-dx avatar Mar 29 '24 17:03 dy-dx

@kt3k

It still occurs for me, sending 2 files I think isn't enough, try bumping it up to a higher number. Usually within the first 10 I get a hash error.

image

I will reiterate though that the default happy path works ok, only when removing fs.writev does it occur (while node passes), which I only did while experimenting with creating a repro, not using this in real code.

Ok:
const fileStream = fs.createWriteStream(vf.filepath);

Potential corruption:
const fileStream = fs.createWriteStream(vf.filepath, { fs: { open: fs.open, write: fs.write, close: fs.close } }); // Removed fs.writev

CoenraadS avatar Mar 30 '24 12:03 CoenraadS

@CoenraadS Thanks. Now I'm able to reproduce the issue consistently with ~20 iterative checks on macOS.

~~The issue (with fs.writev excluded) seems to have started from Deno 1.39.3. It doesn't happen with 1.39.2 or lower.~~ Update: I used 50 sendFiles for checking it, but now I found the check has high chance of false positives, and 1.39.2 and lower version still seem having the issue.

I'll check it in more details soon

kt3k avatar Apr 03 '24 07:04 kt3k

@dy-dx Thanks for the input! ~~Looks related to this issue. I tried the patch https://github.com/nodejs/node/pull/52005 together with https://github.com/nodejs/node/pull/46818, but it didn't fix the given example (it still shows worldhello).. Do you think of any possible cause of it? (Maybe our Node stream polyfills are out-dated somehwere..)~~ Nevermind (I was modifying unrelated file.. 🤦 ). Appliying https://github.com/nodejs/node/pull/52005 seems fixing the given example, and also this issue.

kt3k avatar Apr 05 '24 11:04 kt3k