pm2 icon indicating copy to clipboard operation
pm2 copied to clipboard

PM2 error: read ENOTCONN

Open ksanderer opened this issue 2 years ago • 3 comments

What's going wrong?

It looks like pm2 while restarting an app after error getting unexpected error itself. For now I don't understand what is going and how to replicate the issue. Maybe someone can explain me an error and give me some idea of the direction where to digg.

2022-03-31T22:08:58: PM2 log: App [strategy--mm--sub1:4740] starting in -fork mode-
2022-03-31T22:08:58: PM2 log: App [strategy--mm--sub1:4740] online
2022-03-31T22:29:02: PM2 log: App [strategy--mm--sub1:4740] exited with code [1] via signal [SIGINT]
2022-03-31T22:29:12: PM2 log: App [strategy--mm--sub1:4740] starting in -fork mode-
2022-03-31T22:29:12: PM2 log: App [strategy--mm--sub1:4740] online
2022-03-31T22:29:12: PM2 error: Error: spawn /usr/bin/python3 EAGAIN
    at Process.ChildProcess._handle.onexit (node:internal/child_process:282:19)
    at onErrorNT (node:internal/child_process:477:16)
    at processTicksAndRejections (node:internal/process/task_queues:83:21)
2022-03-31T22:29:12: PM2 log: ===============================================================================
2022-03-31T22:29:12: PM2 log: --- PM2 global error caught ---------------------------------------------------
2022-03-31T22:29:12: PM2 log: Time                 : Thu Mar 31 2022 22:29:12 GMT+0000 (Coordinated Universal Time)
2022-03-31T22:29:12: PM2 error: read ENOTCONN
2022-03-31T22:29:12: PM2 error: Error: read ENOTCONN
    at tryReadStart (node:net:592:20)
    at Socket._read (node:net:603:5)
    at Socket.Readable.read (node:internal/streams/readable:487:10)
    at Socket.read (node:net:644:39)
    at new Socket (node:net:401:12)
    at Object.Socket (node:net:284:41)
    at createSocket (node:internal/child_process:327:14)
    at ChildProcess.spawn (node:internal/child_process:444:23)
    at spawn (node:child_process:698:9)
    at /usr/lib/node_modules/pm2/lib/God/ForkMode.js:118:20
2022-03-31T22:29:12: PM2 log: ===============================================================================
2022-03-31T22:29:12: PM2 error: [PM2] Resurrecting PM2
node:events:368
      throw er; // Unhandled 'error' event
      ^

<ref *1> Error: read ENOTCONN
    at tryReadStart (node:net:592:20)
    at Socket._read (node:net:603:5)
    at Socket.Readable.read (node:internal/streams/readable:487:10)
    at Socket.read (node:net:644:39)
    at new Socket (node:net:401:12)
    at Object.Socket (node:net:284:41)
    at createSocket (node:internal/child_process:327:14)
    at ChildProcess.spawn (node:internal/child_process:444:23)
    at spawn (node:child_process:698:9)
    at /usr/lib/node_modules/pm2/lib/God/ForkMode.js:118:20
Emitted 'error' event on Domain instance at:
    at Socket.emit (node:domain:525:12)
    at emitErrorNT (node:internal/streams/destroy:157:8)
    at emitErrorCloseNT (node:internal/streams/destroy:122:3)
    at processTicksAndRejections (node:internal/process/task_queues:83:21) {
  errno: -107,
  code: 'ENOTCONN',
  syscall: 'read',
  domainEmitter: Socket {
    connecting: false,
    _hadError: false,
    _parent: null,
    _host: null,
    _readableState: ReadableState {
      objectMode: false,
      highWaterMark: 16384,
      buffer: BufferList { head: null, tail: null, length: 0 },
      length: 0,
      pipes: [],
      flowing: true,
      ended: false,
      endEmitted: false,
      reading: true,
      constructed: true,
      sync: false,
      needReadable: true,
      emittedReadable: false,
      readableListening: false,
      resumeScheduled: true,
      errorEmitted: true,
      emitClose: false,
      autoDestroy: true,
      destroyed: true,
      errored: [Circular *1],
      closed: true,
      closeEmitted: false,
      defaultEncoding: 'utf8',
      awaitDrainWriters: null,
      multiAwaitDrain: false,
      readingMore: false,
      dataEmitted: false,
      decoder: null,
      encoding: null,
      [Symbol(kPaused)]: false
    },
    _events: [Object: null prototype] {
      end: [Function: onReadableStreamEnd],
      close: [Function (anonymous)],
      data: [Function: forkErrData]
    },
    _eventsCount: 3,
    _maxListeners: undefined,
    _writableState: WritableState {
      objectMode: false,
      highWaterMark: 16384,
      finalCalled: false,
      needDrain: false,
      ending: false,
      ended: false,
      finished: false,
      destroyed: true,
      decodeStrings: false,
      defaultEncoding: 'utf8',
      length: 0,
      writing: false,
      corked: 0,
      sync: true,
      bufferProcessing: false,
      onwrite: [Function: bound onwrite],
      writecb: null,
      writelen: 0,
      afterWriteTickInfo: null,
      buffered: [],
      bufferedIndex: 0,
      allBuffers: true,
      allNoop: true,
      pendingcb: 0,
      constructed: true,
      prefinished: false,
      errorEmitted: true,
      emitClose: false,
      autoDestroy: true,
      errored: [Circular *1],
      closed: true,
      closeEmitted: false,
      writable: false,
      [Symbol(kOnFinished)]: []
    },
    allowHalfOpen: false,
    _sockname: null,
    _pendingData: null,
    _pendingEncoding: '',
    server: null,
    _server: null,
    [Symbol(async_id_symbol)]: 92180833,
    [Symbol(kHandle)]: null,
    [Symbol(kSetNoDelay)]: false,
    [Symbol(lastWriteQueueSize)]: 0,
    [Symbol(timeout)]: null,
    [Symbol(kBuffer)]: null,
    [Symbol(kBufferCb)]: null,
    [Symbol(kBufferGen)]: null,
    [Symbol(kCapture)]: false,
    [Symbol(kBytesRead)]: 0,
    [Symbol(kBytesWritten)]: 0
  },
  domainThrown: false
}

How could we reproduce this issue?

Right now have no idea.

Supporting information

--- PM2 report ----------------------------------------------------------------
Date                 : Fri Apr 01 2022 11:20:44 GMT+0000 (Coordinated Universal Time)
===============================================================================
--- Daemon -------------------------------------------------
pm2d version         : 5.1.2
node version         : 16.14.2
node path            : /usr/bin/pm2
argv                 : /usr/bin/node,/usr/lib/node_modules/pm2/lib/Daemon.js
argv0                : node
user                 : ec2-user
uid                  : 1000
gid                  : 1000
uptime               : 3min
===============================================================================
--- CLI ----------------------------------------------------
local pm2            : 5.1.2
node version         : 16.14.2
node path            : /usr/bin/pm2
argv                 : /usr/bin/node,/usr/bin/pm2,report
argv0                : node
user                 : ec2-user
uid                  : 1000
gid                  : 1000
===============================================================================
--- System info --------------------------------------------
arch                 : x64
platform             : linux
type                 : Linux
cpus                 : Intel(R) Xeon(R) Platinum 8124M CPU @ 3.00GHz
cpus nb              : 36
freemem              : 30082981888
totalmem             : 73638830080
home                 : /home/ec2-user
===============================================================================

ksanderer avatar Apr 01 '22 11:04 ksanderer

It looks like the issue somehow connected to fork bomb because when the error occurs server stuck in '-bash: fork: retry: No child processes' and unable to perform any command.

But interesting fact is that error occurs after very first app crash, basically it's the first fork.

I have ~230 running apps (microservices) in config. Can it be somehow connected to the issue?

ksanderer avatar Apr 01 '22 12:04 ksanderer

I am having the exact same issue on my end. I'm running pm2 on a linux server in the same way that we have for years, and about 2 weeks ago we rebooted the machine, and have been getting this error every few days ever since. This morning it is occurring within 10 seconds of the command pm2 resurrect which brings 5 processes back up before this error crashes the whole pm2 instance (and all processes). Can anyone help?

Error from log and pm2 report:
2022-06-07T09:07:14: PM2 log: App [phettest-server:0] online
2022-06-07T09:07:14: PM2 log: ===============================================================================
2022-06-07T09:07:14: PM2 log: --- PM2 global error caught ---------------------------------------------------
2022-06-07T09:07:14: PM2 log: Time                 : Tue Jun 07 2022 09:07:14 GMT-0600 (Mountain Daylight Time)
2022-06-07T09:07:14: PM2 error: read ENOTCONN
2022-06-07T09:07:14: PM2 error: Error: read ENOTCONN
    at tryReadStart (net.js:574:20)
    at Socket._read (net.js:585:5)
    at Socket.Readable.read (_stream_readable.js:481:10)
    at Socket.read (net.js:625:39)
    at new Socket (net.js:377:12)
    at Object.Socket (net.js:269:41)
    at createSocket (internal/child_process.js:314:14)
    at ChildProcess.spawn (internal/child_process.js:433:23)
    at spawn (child_process.js:553:9)
    at Object.execFile (child_process.js:237:17)
2022-06-07T09:07:14: PM2 log: ===============================================================================
2022-06-07T09:07:14: PM2 error: [PM2] Resurrecting PM2
events.js:292
      throw er; // Unhandled 'error' event
      ^

<ref *1> Error: read ENOTCONN
    at tryReadStart (net.js:574:20)
    at Socket._read (net.js:585:5)
    at Socket.Readable.read (_stream_readable.js:481:10)
    at Socket.read (net.js:625:39)
    at new Socket (net.js:377:12)
    at Object.Socket (net.js:269:41)
    at createSocket (internal/child_process.js:314:14)
    at ChildProcess.spawn (internal/child_process.js:433:23)
    at spawn (child_process.js:553:9)
    at Object.execFile (child_process.js:237:17)
Emitted 'error' event on Domain instance at:
    at Socket.EventEmitter.emit (domain.js:536:12)
    at emitErrorNT (internal/streams/destroy.js:106:8)
    at emitErrorCloseNT (internal/streams/destroy.js:74:3)
    at processTicksAndRejections (internal/process/task_queues.js:80:21) {
  errno: -107,
  code: 'ENOTCONN',
  syscall: 'read',
  domainEmitter: Socket {
    connecting: false,
    _hadError: false,
    _parent: null,
    _host: null,
    _readableState: ReadableState {
      objectMode: false,
      highWaterMark: 16384,
      buffer: BufferList { head: null, tail: null, length: 0 },
      length: 0,
      pipes: [],
      flowing: true,
      ended: false,
      endEmitted: false,
      reading: true,
      sync: false,
      needReadable: true,
      emittedReadable: false,
      readableListening: false,
      resumeScheduled: true,
      errorEmitted: true,
      emitClose: false,
      autoDestroy: false,
      destroyed: true,
      errored: [Circular *1],
      closed: true,
      closeEmitted: false,
      defaultEncoding: 'utf8',
      awaitDrainWriters: null,
      multiAwaitDrain: false,
      readingMore: false,
      decoder: StringDecoder {
        encoding: 'utf8',
        [Symbol(kNativeDecoder)]: Buffer(7) [Uint8Array] [
          0, 0, 0, 0,
          0, 0, 1
        ]
      },
      encoding: 'utf8',
      [Symbol(kPaused)]: false
    },
    _events: [Object: null prototype] {
      end: [Function: onReadableStreamEnd],
      close: [Function (anonymous)],
      data: [Function: onChildStderr]
    },
    _eventsCount: 3,
    _maxListeners: undefined,
    _writableState: WritableState {
      objectMode: false,
      highWaterMark: 16384,
      finalCalled: false,
      needDrain: false,
      ending: false,
      ended: false,
      finished: false,
      destroyed: true,
      decodeStrings: false,
      defaultEncoding: 'utf8',
      length: 0,
      writing: false,
      corked: 0,
      sync: true,
      bufferProcessing: false,
      onwrite: [Function: bound onwrite],
      writecb: null,
      writelen: 0,
      afterWriteTickInfo: null,
      buffered: [],
      bufferedIndex: 0,
      allBuffers: true,
      allNoop: true,
      pendingcb: 0,
      prefinished: false,
      errorEmitted: true,
      emitClose: false,
      autoDestroy: false,
      errored: [Circular *1],
      closed: true,
      writable: false,
      closeEmitted: false
    },
    allowHalfOpen: false,
    _sockname: null,
    _pendingData: null,
    _pendingEncoding: '',
    server: null,
    _server: null,
    [Symbol(async_id_symbol)]: 7512,
    [Symbol(kHandle)]: null,
    [Symbol(kSetNoDelay)]: false,
    [Symbol(lastWriteQueueSize)]: 0,
    [Symbol(timeout)]: null,
    [Symbol(kBuffer)]: null,
    [Symbol(kBufferCb)]: null,
    [Symbol(kBufferGen)]: null,
    [Symbol(kCapture)]: false,
    [Symbol(kBytesRead)]: 0,
    [Symbol(kBytesWritten)]: 0
  },
  domainThrown: false
}
--- PM2 report ----------------------------------------------------------------
Date                 : Tue Jun 07 2022 09:15:34 GMT-0600 (Mountain Daylight Time)
===============================================================================
--- Daemon -------------------------------------------------
pm2d version         : 5.2.0
node version         : 14.15.0
node path            : /usr/local/bin/pm2
argv                 : /usr/local/bin/node,/usr/local/lib/node_modules/pm2/lib/Daemon.js
argv0                : node
user                 : phet-admin
uid                  : 454144
gid                  : 801
uptime               : 2min
===============================================================================
--- CLI ----------------------------------------------------
local pm2            : 5.2.0
node version         : 14.15.0
node path            : /usr/local/bin/pm2
argv                 : /usr/local/bin/node,/usr/local/bin/pm2,report
argv0                : node
user                 : phet-admin
uid                  : 454144
gid                  : 801
===============================================================================
--- System info --------------------------------------------
arch                 : x64
platform             : linux
type                 : Linux
cpus                 : Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz
cpus nb              : 40
freemem              : 67450916864
totalmem             : 270208983040
home                 : /home/phet-admin
===============================================================================

zepumph avatar Jun 07 '22 15:06 zepumph

I found my fix, it looks like on server reboot, our user nofile limit was reset back to 1024, so ENOTCONN occurred every time we tried to run more processes than that number. Thanks!

zepumph avatar Jun 10 '22 15:06 zepumph