nsqd: stops sending messages after moving time backward

Open vostrik opened this issue 6 years ago • 13 comments

Environment/Pre-Conditions: nsqd version 1.1.0; Node.js application using nsqjs.

Steps to Reproduce:

  1. Run nsqd and Node.js application.
  2. Generate some messages. Everything works correctly.
  3. Move time backward (for ex.: 5 minutes or 1 hour).
  4. Generate some messages. The subscriber doesn't receive them.
     - After some time (perhaps the offset from step 3?), the messages are published.
     - If I move time forward, the messages are published immediately.

Actual Result:

Messages aren't sent to subscriber after moving time back.

Expected Result:

Messages are sent to subscriber after moving time back.

vostrik avatar Jan 11 '19 15:01 vostrik

I generally expect various things to not work right when system time moves backwards :)

Go 1.9 added "transparent monotonic time" in certain cases: https://golang.org/doc/go1.9#monotonic-time There may be ways we can adjust the nsq code to get monotonic-time-based comparisons and timeouts ... if nsqd is the problem here rather than the Node.js client library ...

ploxiln avatar Jan 11 '19 20:01 ploxiln

I think it is certainly nsqd, because restarting the Node.js application didn't help.

vostrik avatar Jan 11 '19 20:01 vostrik

Can I help with this issue? Maybe you can point me to the modules and methods that should be refactored.

vostrik avatar Jan 11 '19 21:01 vostrik

thanks, we'll look into it

ploxiln avatar Jan 11 '19 21:01 ploxiln

Since Go does not use separate functions/options/objects for monotonic time, but hides it inside the wall-clock time struct, it's probably a bit tricky to figure out where it's being lost, or how to carry it along where needed.

ploxiln avatar Jan 11 '19 21:01 ploxiln

I'd be surprised if "normal" channel sends were the problem, so I'd start by looking at other timeout related things. The two that come to mind are:

  1. network connection level timeouts
  2. time.Ticker use cases, e.g. to flush buffered messages to a client

mreiferson avatar Jan 11 '19 21:01 mreiferson

Sorry to bother you, but is there any progress? Our team has some resources to investigate this behaviour, i.e. you can simply point us to the potential problem areas. We want to help.

vostrik avatar Jan 17 '19 14:01 vostrik

No progress.

I suspect the issue may be around https://github.com/nsqio/nsq/blob/master/nsqd/guid.go#L59
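
To make the suspicion concrete, here is a simplified sketch of a timestamp-based ID generator that refuses to issue IDs while the clock reads earlier than the last timestamp it used. This is an assumption about the general shape of such code, not a copy of guid.go; `idGen`, `errTimeBackwards`, and the bit widths are hypothetical. A generator like this would match the reported symptom: it fails after the clock is set back and recovers once the clock catches up.

```go
package main

import (
	"errors"
	"fmt"
)

var errTimeBackwards = errors.New("time has gone backwards")

// idGen is a hypothetical, simplified timestamp-based ID generator.
type idGen struct {
	now    func() int64 // wall-clock source in nanoseconds (injectable for the demo)
	lastTS int64        // last coarse timestamp used
	seq    int64        // counter within one coarse timestamp
}

// next packs a coarse timestamp and a per-timestamp sequence into one ID.
// It returns an error while the clock reads earlier than lastTS.
// A real generator would also have to handle sequence overflow.
func (g *idGen) next() (int64, error) {
	ts := g.now() >> 20 // roughly milliseconds
	if ts < g.lastTS {
		return 0, errTimeBackwards
	}
	if ts == g.lastTS {
		g.seq++
	} else {
		g.seq = 0
	}
	g.lastTS = ts
	return ts<<12 | g.seq&0xfff, nil
}

func main() {
	clock := int64(1000000) << 20 // fake wall clock, in nanoseconds
	g := &idGen{now: func() int64 { return clock }}

	_, err := g.next()
	fmt.Println("before jump:", err)

	clock -= 300 << 20 // step the fake clock back
	_, err = g.next()
	fmt.Println("after jump back:", err)

	clock += 300 << 20 // clock catches up again
	_, err = g.next()
	fmt.Println("after catching up:", err)
}
```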

ploxiln avatar Jan 17 '19 17:01 ploxiln

> I suspect the issue may be around https://github.com/nsqio/nsq/blob/master/nsqd/guid.go#L59

Ahhh, yes, I completely forgot about the GUID code, good call!

I had been poking around at all the network deadlines and tickers, but I'm pretty confident they're not the issue.

mreiferson avatar Jan 17 '19 17:01 mreiferson

Can we add a clock sequence (14 bits) to the GUID? This field helps when time moves backwards. More here: https://blog.stephencleary.com/2010/11/few-words-on-guids.html
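
A rough sketch of the clock-sequence idea, in the style of version 1 UUIDs: whenever the timestamp is observed to go backwards, a 14-bit counter is bumped, so IDs issued after the jump differ from earlier ones even if the timestamps repeat. The type `seqGen` and its layout are hypothetical and not part of nsqd.

```go
package main

import "fmt"

// seqGen is a hypothetical generator state holding a 14-bit clock sequence.
type seqGen struct {
	lastTS   int64
	clockSeq uint16 // only the low 14 bits are used
}

// stamp returns the timestamp together with the clock sequence to embed
// next to it. The sequence changes only when time moves backwards.
// Uniqueness within a single timestamp still needs a separate counter.
func (g *seqGen) stamp(ts int64) (int64, uint16) {
	if ts < g.lastTS {
		g.clockSeq = (g.clockSeq + 1) & 0x3fff // wrap at 14 bits
	}
	g.lastTS = ts
	return ts, g.clockSeq
}

func main() {
	g := &seqGen{}
	for _, ts := range []int64{100, 101, 50, 51} {
		t, s := g.stamp(ts)
		fmt.Printf("ts=%d clockSeq=%d\n", t, s)
	}
}
```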

vostrik avatar Jan 18 '19 11:01 vostrik

Hmmm, I'm not sure we have space for that.

I think we can just use a time duration rather than timestamp, and Go will transparently handle the monotonicity for us.
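
One way to read this suggestion, as a sketch only: capture the wall-clock time once at startup and derive later timestamps from it plus the elapsed duration, which Go measures monotonically when the start time carries a monotonic reading. `monoClock` and its methods are hypothetical names, not nsqd code.

```go
package main

import (
	"fmt"
	"time"
)

// monoClock is a hypothetical timestamp source that cannot go backwards
// within one process, even if the wall clock is adjusted.
type monoClock struct {
	start time.Time // carries a monotonic reading
	base  int64     // wall-clock nanoseconds captured once at startup
}

func newMonoClock() monoClock {
	now := time.Now()
	return monoClock{start: now, base: now.UnixNano()}
}

// nanos returns the startup wall time plus the monotonic elapsed time.
func (c monoClock) nanos() int64 {
	return c.base + int64(time.Since(c.start))
}

func main() {
	c := newMonoClock()
	a := c.nanos()
	time.Sleep(time.Millisecond)
	b := c.nanos()
	fmt.Println("non-decreasing:", b >= a)
}
```

The trade-off is that these timestamps drift away from the wall clock after an adjustment, and uniqueness across restarts still depends on the wall-clock value captured at startup.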

mreiferson avatar Jan 18 '19 15:01 mreiferson

The last time we visited this issue was #658 / #663

We didn't want to drastically change the GUID generation algorithm. Although we'd like to say the only guarantee is that IDs are unique within a particular consumer connection to nsqd, it's possible that some users may implicitly depend on the uniqueness within a channel across multiple nsqd and restarts of nsqd.

An alternate algorithm that would lose some of that across-sources-and-restarts uniqueness, but be compatible with the odd and inadvisable condition of time going backwards: start at a random initial value, and just increment and wrap (similar to tcp sequence numbers but 64-bit).
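
A minimal sketch of that alternate algorithm, assuming a plain 64-bit counter seeded from `crypto/rand`; `counterGen` is a hypothetical name. Unsigned overflow in Go wraps around, so no explicit wrap handling is needed, and the clock is never consulted.

```go
package main

import (
	"crypto/rand"
	"encoding/binary"
	"fmt"
	"sync/atomic"
)

// counterGen is a hypothetical ID generator: random start, then increment.
type counterGen struct {
	n uint64
}

// newCounterGen seeds the counter with 64 random bits.
func newCounterGen() (*counterGen, error) {
	var b [8]byte
	if _, err := rand.Read(b[:]); err != nil {
		return nil, err
	}
	return &counterGen{n: binary.BigEndian.Uint64(b[:])}, nil
}

// next returns the next ID. It is safe for concurrent use, and the
// counter wraps to 0 after the maximum uint64 value.
func (g *counterGen) next() uint64 {
	return atomic.AddUint64(&g.n, 1)
}

func main() {
	g, err := newCounterGen()
	if err != nil {
		panic(err)
	}
	a, b := g.next(), g.next()
	fmt.Println("consecutive:", b-a == 1)
}
```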

ploxiln avatar Jan 18 '19 21:01 ploxiln

Hmmm, I do vaguely remember discussing this and being frustrated about the "backwards compatibility".

mreiferson avatar Jan 20 '19 23:01 mreiferson