spring-data-mongodb icon indicating copy to clipboard operation
spring-data-mongodb copied to clipboard

Misleading Change Streams documentation

Open michaldo opened this issue 1 year ago • 2 comments

Change Streams can be consumed with both, the imperative and the reactive MongoDB Java driver. It is highly recommended to use the reactive variant, as it is less resource-intensive.

src/main/antora/modules/ROOT/pages/mongodb/change-streams.adoc

I do not understand why reactive variant is highly recommended. I follow example give in documentation part interactive, and I see one thread is started and listen changes on Mongo socket. Is one extra thread resource intensive? What about virtual threads?

If I use standard Mongo client in application and I want use Change Stream, should I a) rewrite application b) use 2 clients in single application c) choose not recommended variant ?

IMHO documentation preface presents false statement or rough opinion without justification/explanation. The preface is misleading and may cause wrong developers decision.

I propose remove the sentence at all or clarify resource utilization subject

michaldo avatar Jun 13 '24 18:06 michaldo

Reactive Change Streams start their activity from a push-based I/O mechanism while imperative Change Streams (also, tailable cursors) require constant polling. Polling is a repetitive activity that consumes computation time while there might be no data at all. Additionally, each imperative change stream requires one thread resulting in a linear relationship between the number of change streams being consumed vs. CPU and memory cost.

Reactive Change streams are constrained in the number of their I/O threads to roughly the number of CPU cores regardless the number of concurrent change streams being consumed.

We introduced change stream and tailable cursor support in 2018. I think in late 2018, the JDK team started considering lightweight threads if I'm not mistaken. I think it is worth exploring how Virtual Threads behave if you enable these on your Executor, let us know how this goes.

mp911de avatar Jun 14 '24 11:06 mp911de

Thanks Mark for answer. I was focused on thread issue and I miss poll issue.

My findings:

  1. Both standard and reactive solution capture changes by calling one per second query: DEBUG o.m.driver.protocol.command - Command "getMore" started... At this point my reactor skill ends. For standard client, when I used platform thread, the thread dump is:
"my-platform-cdc" #58 [34188] prio=5 os_prio=0 cpu=0.00ms elapsed=4.93s tid=0x000001e6153831b0 nid=34188 runnable  
[0x0000003f225fe000]
   java.lang.Thread.State: RUNNABLE
	at sun.nio.ch.Net.poll([email protected]/Native Method)
	at sun.nio.ch.NioSocketImpl.park([email protected]/NioSocketImpl.java:191)
	at sun.nio.ch.NioSocketImpl.park([email protected]/NioSocketImpl.java:201)
	at sun.nio.ch.NioSocketImpl.implRead([email protected]/NioSocketImpl.java:309)
	at sun.nio.ch.NioSocketImpl.read([email protected]/NioSocketImpl.java:346)
	at sun.nio.ch.NioSocketImpl$1.read([email protected]/NioSocketImpl.java:796)
	at java.net.Socket$SocketInputStream.read([email protected]/Socket.java:1109)
	at com.mongodb.internal.connection.SocketStream.read(SocketStream.java:176)

When I use virtual thread, thred dump is:

#56 "my-virt-cdc" virtual
      java.base/java.lang.VirtualThread.park(VirtualThread.java:582)
      java.base/java.lang.System$2.parkVirtualThread(System.java:2643)
      java.base/jdk.internal.misc.VirtualThreads.park(VirtualThreads.java:54)
      java.base/java.util.concurrent.locks.LockSupport.park(LockSupport.java:369)
      java.base/sun.nio.ch.Poller.pollIndirect(Poller.java:139)
      java.base/sun.nio.ch.Poller.poll(Poller.java:102)
      java.base/sun.nio.ch.Poller.poll(Poller.java:87)
      java.base/sun.nio.ch.NioSocketImpl.park(NioSocketImpl.java:175)
      java.base/sun.nio.ch.NioSocketImpl.park(NioSocketImpl.java:201)
      java.base/sun.nio.ch.NioSocketImpl.implRead(NioSocketImpl.java:309)
      java.base/sun.nio.ch.NioSocketImpl.read(NioSocketImpl.java:346)
      java.base/sun.nio.ch.NioSocketImpl$1.read(NioSocketImpl.java:796)
      java.base/java.net.Socket$SocketInputStream.read(Socket.java:1109)
      com.mongodb.internal.connection.SocketStream.read(SocketStream.java:176)

My conclusion is:

  1. Both standard and reactive are effectively poll solution, because Command "getMore" is executed one per second.
  2. I guess it is not good that platform thread is RUNNABLE - otherwise you will not prefer reactive in the documentation preface.
  3. I guess it is good that virtual thread is parked (java.lang.VirtualThread.park(VirtualThread.java:582))

michaldo avatar Jun 15 '24 08:06 michaldo

As for now, the issue isn't actionable and therefore I'm closing it.

mp911de avatar Aug 01 '24 08:08 mp911de