nodejs-pubsub icon indicating copy to clipboard operation
nodejs-pubsub copied to clipboard

Error [ERR_STREAM_WRITE_AFTER_END]: write after end

Open glesage opened this issue 5 years ago • 33 comments

Environment details

  • OS: Google Cloud Container-Optimized OS
  • Node.js version: 10.17.0
  • npm version: 6.12.0
  • @google-cloud/pubsub version: 1.x (1.1.5, tried reverting down to 1.0.0, same error, currently using 0.21.1)

Steps to reproduce

Unclear so far but have been seeing it after publishing 10k+ messages to various topics. Will get back with more details when we can but if anyone else is seeing this to confirm this isn't a config issue let us know.

We are using our own wrapper around pubsub which you can find here: https://github.com/starchup/google-pubsub-wrapper, extremely bare bones

Error stacktrace


E 2019-11-19T16:49:45.701927239Z events.js:174
E 2019-11-19T16:49:45.702039214Z       throw er; // Unhandled 'error' event
E 2019-11-19T16:49:45.702047675Z       ^
E 2019-11-19T16:49:45.702054717Z 
E 2019-11-19T16:49:45.702060707Z Error [ERR_STREAM_WRITE_AFTER_END]: write after end
E 2019-11-19T16:49:45.702068184Z     at writeAfterEnd (_stream_writable.js:248:12)
E 2019-11-19T16:49:45.702074930Z     at ClientHttp2Stream.Writable.write (_stream_writable.js:296:5)
E 2019-11-19T16:49:45.702081282Z     at filterStack.sendMessage.then.message (/node_modules/@grpc/grpc-js/build/src/call-stream.js:372:34)
E 2019-11-19T16:49:45.702088817Z     at process._tickCallback (internal/process/next_tick.js:68:7)
E 2019-11-19T16:49:45.702094878Z Emitted 'error' event at:
E 2019-11-19T16:49:45.702121372Z     at errorOrDestroy (internal/streams/destroy.js:107:12)
E 2019-11-19T16:49:45.702164237Z     at onwriteError (_stream_writable.js:436:5)
E 2019-11-19T16:49:45.702169699Z     at onwrite (_stream_writable.js:461:5)
E 2019-11-19T16:49:45.702174990Z     at /node_modules/async-listener/glue.js:188:31
E 2019-11-19T16:49:45.702180323Z     at process._tickCallback (internal/process/next_tick.js:63:19)

Thanks!

glesage avatar Nov 20 '19 12:11 glesage

@glesage thanks for reporting! Do you know when this error starting popping up? Did it start after a dependency update or was it there from the start?

Also do you have any insight as to whether this might be coming from the publisher or the subscriber?

callmehiphop avatar Nov 20 '19 20:11 callmehiphop

This started happening for us when we rolled out version 1.1.5 a few days ago. Before then we were at 0.21.1 so pretty big bump indeed

Unfortunately I do not know for certain because the app that uses this is both a subscriber and publisher and the stack traces are very un-helpful /: No matter how many try catch blocks we add temporarily to debug they never seem to trigger and our app just crashes without correctly handling this error

glesage avatar Nov 20 '19 21:11 glesage

We are actually able to go up to v0.29.1 without this issue happening so it seems possible that v0.30.0 introduced it (or exposed it) - https://github.com/googleapis/nodejs-pubsub/releases/tag/v0.30.0

Maybe one of the changes to grpc extensions & dependencies has something to do with it. Will try to identify more clearly where it gets thrown tomorrow

glesage avatar Nov 20 '19 23:11 glesage

We've reverted to version 0.29.1 for now and been using for a week with no issues. Hopefully we'll find some time to debug this some more so we can upgrade but may be a while. Just throwing it out there for those who might be able to (:

glesage avatar Nov 29 '19 13:11 glesage

As a quick note just ran another test of this and still happening with 1.5.0

glesage avatar Jan 08 '20 12:01 glesage

Hello,

I can confirm the same behaviour. This usually happens after a couple of hours. After restarting the application everything works fine again. (v1.3)

psychonetic avatar Feb 07 '20 09:02 psychonetic

We are actually able to go up to v0.29.1 without this issue happening so it seems possible that v0.30.0 introduced it (or exposed it) - https://github.com/googleapis/nodejs-pubsub/releases/tag/v0.30.0

Maybe one of the changes to grpc extensions & dependencies has something to do with it. Will try to identify more clearly where it gets thrown tomorrow

Yeah, and this happened in 0.30.0, too:

https://github.com/googleapis/nodejs-pubsub/pull/658

We're looking at just reverting back to the gRPC C++ bindings for now. I'm talking with some folks on Monday before we commit to that. In the meantime, it's possible to manually switch to them and see if it helps:

const {PubSub} = require('@google-cloud/pubsub');
const grpc = require('grpc');
const pubsub = new PubSub({grpc});

feywind avatar Feb 07 '20 18:02 feywind

Is this still an ongoing problem for everyone? I'm hoping to get us onto grpc-js 1.x soon. That library has really stabilized lately, and is no longer considered beta, so it may help with anything that gRPC C++ previously helped with.

feywind avatar May 05 '20 17:05 feywind

Yess. This issue is affecting a user running @google-cloud/pubsub v1.5.0

megaz94 avatar May 06 '20 13:05 megaz94

I spoke with @megaz94 separately, but I wanted to check once more on this - @google-cloud/pubsub version 2.0.0 should be pulling in a newer (post-1.0) grpc-js, so that may help.

feywind avatar May 21 '20 23:05 feywind

We're also seeing this in @google-cloud/pubsub v1.5.0 too Did anybody confirm v2 is a fix? I need to choose between bumping to 2.0.0 or passing grpc bindings - can't do both with limited resources, but would prefer to bump to v2

naugtur avatar May 26 '20 10:05 naugtur

We are still seeing this error on 2.0.0

matetukacs avatar May 27 '20 21:05 matetukacs

@glesage @naugtur @matetukacs Can you all describe the character of data being sent and received which is problematic? To unpack that question a bit, I'm asking for e.g. the size of messages, the frequency of message, the length of time before the problem typically happens.

That error stack looks like something from readable-stream. For me, grpc-js is pulling readable-stream 2.3.7, which seems really old. I'm not convinced that's the cause yet, but it seems like we should update that at some point. @murgatroid99 Is there anything in there that sounds familiar and might explain that problem up top?

In general, it sort of looks like the HTTP/2 connection is getting closed on us and we don't realize it, and then we try to write more. The disconnection happens naturally at about 30 minutes in, and we're supposed to be catching that and restarting it. (There was a bug recently about that, which is why I was asking about the character of messages above.)

feywind avatar May 29 '20 23:05 feywind

grpc-js has never had a dependency on readable-stream, so I don't know where that could be coming from.

Regardless, a similar issue was filed in the google-gax repository and it looks like grpc/grpc-node#1436 will address it.

murgatroid99 avatar May 29 '20 23:05 murgatroid99

I think the external tag isn't the right thing here, but since there's a fix happening elsewhere potentially, I'm going to put it on there. I'll keep an eye on this for when we can bump the gax dependency.

feywind avatar Jun 03 '20 18:06 feywind

Experiencing this issue on 2.1.0

danielyaa5 avatar Jun 10 '20 19:06 danielyaa5

Experiencing this issue on 2.1.0

Confirmed - I'm getting this error after idling and not receiving a message for about 15 mins. @google-cloud/[email protected] and [email protected].

I think I'll try switching to the C++ build of grpc

6utt3rfly avatar Jun 11 '20 22:06 6utt3rfly

Ah, darn. I was hoping 1.0.5 would fix it. I'm going to go ahead and bump the dependency version anyway, in case it helps the others' issues.

feywind avatar Jun 12 '20 18:06 feywind

Quick update on this, the consensus was that we probably shouldn't actually do the explicit dependency on grpc-js in here, because it might fall out of sync with other libraries accidentally. The gax library has been updating their dependencies, so in theory, that should be pulling in a newer one now.

At some point I'll plan to update the gax dependency explicitly. This is related to this: https://github.com/googleapis/nodejs-pubsub/issues/879

feywind avatar Jul 21 '20 23:07 feywind

We are seeing this in v2.3.0 as well

shufengh avatar Aug 15 '20 00:08 shufengh

@shufengh It looks like 2.5.0 is the one that pulled in the new gax version explicitly. Maybe give that one a try?

https://github.com/googleapis/nodejs-pubsub/pull/1085

feywind avatar Aug 18 '20 22:08 feywind

Going through all the existing issues today... is there anyone who's still having this issue on the latest version of the library?

feywind avatar Nov 20 '20 20:11 feywind

We had to roll back to older version because the grpc.js was causing some trouble. Didn't check the latest yet.

naugtur avatar Nov 20 '20 20:11 naugtur

we are also experiencing the same issue, here is the stack trace. This issue happens when we have a long running event stream and it times out, all future request afterwards fail because the process is shutdown due to the uncaughtexception.

2020-11-25T20:18:17.836Z - error: write after end stack=Error [ERR_STREAM_WRITE_AFTER_END]: write after end
    at writeAfterEnd (_stream_writable.js:248:12)
    at ClientHttp2Stream.Writable.write (_stream_writable.js:296:5)
    at filterStack.sendMessage.then (.../node_modules/@grpc/grpc-js/build/src/call-stream.js:

pedramphp avatar Nov 25 '20 20:11 pedramphp

Is there any solution for this?

insanrizky avatar Dec 26 '20 04:12 insanrizky

Not that I'm aware, but there is a new grpc-js coming out soon that has more debug info that might help us figure it out. There are some related issues happening across several libraries at the moment.

feywind avatar Jan 11 '21 21:01 feywind

The new grpc-js update should be available now, though you might have to remove package-lock.json and npm install again to get it.

https://github.com/grpc/grpc-node/pull/1658

As before, please set these environment variables before running your app to see if we can get some more info:

GRPC_TRACE=all
GRPC_VERBOSITY=DEBUG

feywind avatar Jan 15 '21 19:01 feywind

We'll be upgrading soon and currently see this see this in production. I will follow up with our findings.

arbourd avatar Jan 15 '21 21:01 arbourd

@arbourd (or anyone else), any luck grabbing some more debug logs for the issue?

feywind avatar Feb 12 '21 21:02 feywind

Hey @feywind, thanks for reminding me. I'll go check out logs, hopefully the data is still around because I completely forgot about this.

A couple of quick observations:

  • We still see this with 2.8.0 (haven't tried the recently release 2.9.0 yet).
  • We only see it on subscriptions.

arbourd avatar Feb 13 '21 20:02 arbourd