
Methods that perform a fetch should throw any socket errors directly (ECONNRESET)

Open ryanwilsonperkin opened this issue 1 year ago • 8 comments

Related: https://github.com/googleapis/google-cloud-node/issues/2254

Environment details

  • which product (packages/*): @google-cloud/storage
  • OS: Linux/MacOS
  • Node.js version: 20
  • npm version: 10.2.4
  • google-cloud-node version: 7.10.0

Steps to reproduce

When running multiple uploads in parallel, we frequently encounter ECONNRESET socket errors such as

FetchError: request to https://storage.googleapis.com/storage/v1/b/path-redacted failed, reason: write ECONNRESET
    at ClientRequest.<anonymous> (/app/node_modules/.pnpm/[email protected]/node_modules/node-fetch/lib/index.js:1501:11)
    at ClientRequest.emit (node:events:518:28)
    at ClientRequest.emit (node:domain:488:12)
    at TLSSocket.socketErrorListener (node:_http_client:495:9)
    at TLSSocket.emit (node:events:518:28)
    at TLSSocket.emit (node:domain:488:12)
    at emitErrorNT (node:internal/streams/destroy:169:8)
    at emitErrorCloseNT (node:internal/streams/destroy:128:3)
    at processTicksAndRejections (node:internal/process/task_queues:82:21) {
  type: 'system',
  errno: 'ECONNRESET',
  code: 'ECONNRESET'
}

These happen for us when using the Bucket#upload method. We would like to be able to catch the error and handle it appropriately, but it is thrown in such a way that it can't be caught by a try/catch around Bucket#upload and instead needs to be captured at the process level by an event handler.

When GCS is instantiating the node-fetch client, the client should capture any FetchError and reject the promise with it.
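For reference, a minimal sketch of the process-level workaround described above (assuming the error escapes the promise chain; the handler body is illustrative, not part of the library):

// Catch the stray socket error at the process level, since a try/catch around
// Bucket#upload never sees it.
process.on("uncaughtException", (err) => {
  if (err && err.code === "ECONNRESET") {
    console.error("Socket error escaped the upload promise:", err.message);
    return; // swallow it so the process keeps running
  }
  // Preserve crash-on-unknown-error behavior for everything else.
  console.error(err);
  process.exit(1);
});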

ryanwilsonperkin avatar Apr 30 '24 21:04 ryanwilsonperkin

Fascinating, do you mind sharing a reproducible snippet? This should be captured via try { await … } catch (e) {}

danielbankhead avatar Jun 06 '24 20:06 danielbankhead

const crypto = require("node:crypto");
const fs = require("node:fs/promises");

const { Storage } = require("@google-cloud/storage");

// Replace with organization-specific values
const PROJECT_ID = "";
const BUCKET_NAME = "";
const NUM_TEST_FILES = 10_000;
const SOURCE_DIRECTORY = "/tmp/gcs-example";

async function createTestFiles() {
  const files = [];
  await fs.mkdir(SOURCE_DIRECTORY, { recursive: true });
  for (let i = 0; i < NUM_TEST_FILES; i++) {
    const uuid = crypto.randomUUID();
    const file = `${SOURCE_DIRECTORY}/${uuid}`;
    await fs.writeFile(file, uuid);
    files.push(file);
  }
  return files;
}

(async function main() {
  const files = await createTestFiles();
  const gcs = new Storage({ projectId: PROJECT_ID });
  const bucket = gcs.bucket(BUCKET_NAME);
  try {
    await Promise.all(files.map((file) => bucket.upload(file)));
  } catch (err) {
    // This should be able to catch the socket error but it does not
  }
})();

Here's a snippet that should be able to reproduce it. It will attempt to perform 10k concurrent uploads, which is likely sufficient to encounter the socket error, but you may find that the size of the files being created needs to be increased to trigger the issue.
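For instance, a hypothetical tweak to createTestFiles along those lines (FILE_SIZE_BYTES and the reduced file count are assumed knobs, not part of the original snippet):

const crypto = require("node:crypto");
const fs = require("node:fs/promises");

const SOURCE_DIRECTORY = "/tmp/gcs-example";
// Fewer but larger files than the original snippet; 1,000 × ~5 MiB ≈ 5 GB of
// temp data, so adjust both knobs to taste.
const NUM_TEST_FILES = 1_000;
const FILE_SIZE_BYTES = 5 * 1024 * 1024;

async function createLargerTestFiles() {
  const files = [];
  await fs.mkdir(SOURCE_DIRECTORY, { recursive: true });
  for (let i = 0; i < NUM_TEST_FILES; i++) {
    const uuid = crypto.randomUUID();
    const file = `${SOURCE_DIRECTORY}/${uuid}`;
    // Repeat the UUID until the payload reaches the target size, so each
    // upload stays on the wire long enough to hit a reset.
    await fs.writeFile(file, uuid.repeat(Math.ceil(FILE_SIZE_BYTES / uuid.length)));
    files.push(file);
  }
  return files;
}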

ryanwilsonperkin avatar Jun 06 '24 21:06 ryanwilsonperkin

That's strange - I'm able to capture the error just fine locally with the same code provided, with a few minor mods:

index.mjs

import crypto from "node:crypto";
import fs from "node:fs/promises";

import { Storage } from "@google-cloud/storage";

// Replace with organization-specific values
const NUM_TEST_FILES = 10_000;
const SOURCE_DIRECTORY = "/tmp/gcs-example";

async function createTestFiles() {
  const files = [];
  await fs.mkdir(SOURCE_DIRECTORY, { recursive: true });
  for (let i = 0; i < NUM_TEST_FILES; i++) {
    const uuid = crypto.randomUUID();
    const file = `${SOURCE_DIRECTORY}/${uuid}`;
    await fs.writeFile(file, uuid);
    files.push(file);
  }
  return files;
}

async function main() {
  console.log(`Creating ${NUM_TEST_FILES} test file(s)...`);
  const files = await createTestFiles();
  console.log("...created.");

  const gcs = new Storage();
  const bucket = gcs.bucket(process.env.BUCKET_NAME);
  try {
    await Promise.all(files.map((file) => bucket.upload(file)));
  } catch (err) {
    // This should be able to catch the socket error but it does not
    console.dir({ err });
  }

  console.log("done.");
}

await main();

Which version of storage are you using?

danielbankhead avatar Jun 07 '24 21:06 danielbankhead

*ah, I'm assuming google-cloud-node version => @google-cloud/storage. This seems to work fine in @google-cloud/[email protected]

danielbankhead avatar Jun 07 '24 21:06 danielbankhead

We're currently on @google-cloud/[email protected]; was anything meaningfully changed between the two versions that would explain that?

Is it the ECONNRESET error that you're catching in the catch block?

ryanwilsonperkin avatar Jun 07 '24 22:06 ryanwilsonperkin

There were a number of changes upstream that may have resolved this; after digging, I wasn't able to point to a particular change that would have fixed it.

Is it the ECONNRESET error that you're catching in the catch block?

Yep, I see that error being caught.

danielbankhead avatar Jun 10 '24 18:06 danielbankhead

@ryanwilsonperkin @danielbankhead it sounds like this has been resolved?

ddelgrosso1 avatar Jun 26 '24 19:06 ddelgrosso1

I intend to come back to this when work slows down a bit, but the issue still appears to be present in our environment. I'm going to see if I can put together a more representative reproduction case.

ryanwilsonperkin avatar Jun 26 '24 19:06 ryanwilsonperkin

I can also reproduce this (inconsistently), but only in our application. Using the scripts above, I never get ECONNRESET - but I do get other errors, which are handled correctly.

Here's an example of the error we see:

node:internal/process/promises:289
            triggerUncaughtException(err, true /* fromPromise */);
            ^
GaxiosError: request to https://storage.googleapis.com/upload/storage/v1/b/bucket/o?name=filename&uploadType=resumable failed, reason: read ECONNRESET
    at Gaxios._request (/path/to/our/app/node_modules/.pnpm/[email protected][email protected]/node_modules/gaxios/src/gaxios.ts:183:13)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async Impersonated.requestAsync (/path/to/our/app/node_modules/.pnpm/[email protected][email protected]/node_modules/google-auth-library/build/src/auth/oauth2client.js:405:18)
    at async Upload.makeRequest (/path/to/our/app/node_modules/.pnpm/@[email protected][email protected]/node_modules/@google-cloud/storage/build/cjs/src/resumable-upload.js:743:21)
    at async uri.retries (/path/to/our/app/node_modules/.pnpm/@[email protected][email protected]/node_modules/@google-cloud/storage/build/cjs/src/resumable-upload.js:410:29)
    at async Upload.createURIAsync (/path/to/our/app/node_modules/.pnpm/@[email protected][email protected]/node_modules/@google-cloud/storage/build/cjs/src/resumable-upload.js:407:21) {
  config: {
    method: 'POST',
    url: 'https://storage.googleapis.com/upload/storage/v1/b/bucket/o?name=filename&uploadType=resumable',
    params: {
      name: 'filename',
      uploadType: 'resumable'
    },
    data: { contentEncoding: 'gzip' },
    headers: {
      'User-Agent': 'gcloud-node-storage/7.9.0 google-api-nodejs-client/9.7.0',
      'x-goog-api-client': 'gl-node/20.11.1 gccl/7.9.0-CJS gccl-invocation-id/a46ab3ae-18da-4e7e-b105-ebc1794756dd',
      'X-Upload-Content-Type': 'application/x-ndjson',
      Authorization: 'Bearer snip',
      'Content-Type': 'application/json'
    },
    validateStatus: [Function (anonymous)],
    paramsSerializer: [Function: paramsSerializer],
    body: '{"contentEncoding":"gzip"}',
    responseType: 'unknown',
    errorRedactor: [Function: defaultErrorRedactor]
  },
  response: undefined,
  error: FetchError: request to https://storage.googleapis.com/upload/storage/v1/b/bucket/o?name=filename&uploadType=resumable failed, reason: read ECONNRESET
      at ClientRequest.<anonymous> (/path/to/our/app/node_modules/.pnpm/[email protected][email protected]/node_modules/node-fetch/lib/index.js:1505:11)
      at ClientRequest.emit (node:events:518:28)
      at ClientRequest.emit (node:domain:488:12)
      at TLSSocket.socketErrorListener (node:_http_client:495:9)
      at TLSSocket.emit (node:events:518:28)
      at TLSSocket.emit (node:domain:488:12)
      at emitErrorNT (node:internal/streams/destroy:169:8)
      at emitErrorCloseNT (node:internal/streams/destroy:128:3)
      at process.processTicksAndRejections (node:internal/process/task_queues:82:21) {
    type: 'system',
    errno: 'ECONNRESET',
    code: 'ECONNRESET'
  },
  code: 'ECONNRESET'
}

smcgivern avatar Jul 15 '24 14:07 smcgivern

@smcgivern I’m not too familiar with pnpm, however I’m seeing [email protected] in the log. Do you mind checking the gaxios version and potentially updating?
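If it helps, here's a quick sketch (assuming pnpm's usual node_modules/.pnpm layout; run it from the project root) that lists every gaxios copy the install actually materialized, so you can confirm which version is in play:

const fs = require("node:fs");

// Folder names under .pnpm typically look like "gaxios@<version>", sometimes
// with a peer-dependency suffix appended.
const gaxiosCopies = fs
  .readdirSync("node_modules/.pnpm")
  .filter((name) => name.startsWith("gaxios@"));

console.log(gaxiosCopies);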

danielbankhead avatar Jul 16 '24 17:07 danielbankhead

pnpm's just an npm replacement, but the library version point is a fair one - having upgraded, I haven't been able to reproduce yet (although I wasn't able to reproduce consistently in the first place). I will report back if I see the same problem on the latest versions of this library + gaxios.

smcgivern avatar Jul 19 '24 10:07 smcgivern

Thanks, closing for now.

danielbankhead avatar Jul 23 '24 17:07 danielbankhead