Andy Pan comments

Results: 356 comments of Andy Pan

opt: improve the performance of sending data

On what platform did you run the benchmark? Also, which version of gnet were you using for the benchmark?

opt: improve the performance of sending data

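Network libraries built on I/O multiplexing mostly work by reacting to the event types the system reports. As for the client you describe, which sends fast but receives slowly: any library that, like `gnet`, buffers the data the application layer is waiting to write will have this problem. Even if the library did not buffer pending writes on behalf of the application, the application would have to buffer them itself and would hit the same problem. This is the classic outbound data backlog problem for this class of network frameworks.

The most cost-effective solution is for the server to set a threshold: each time the application writes data back, it checks whether the current backlog of pending writes has reached that threshold, and to avoid OOM the server usually has to discard data. `netty` takes this approach and provides a `setWriteBufferHighWaterMark` method. In `gnet` it is even simpler: `OutboundBuffered()` already lets you check the backlog of the outbound queue at any time, so the application can make the decision itself.

evio's I/O handling, by contrast, is the non-mainstream approach. It does solve this problem, but because it handles only a single kind of event each time, it cannot take full advantage of `epoll`'s event notification mechanism and is bound to lose some performance. The solution you proposed could also solve the problem in theory, but its cost is not small: once you do that, the readable event has to be added back later, which takes a system call. In LT mode the pending-write queue will very likely often hold some backlog, but the client can usually read it off quickly, so OOM does not occur the vast majority of the time. With your approach, the readable event would have to be removed and re-added frequently, which would certainly hurt performance. That amounts to imposing the cost of fixing one extreme scenario on the majority of normal scenarios, which is a poor trade.

Also, in the real world, performance problems cannot all be pushed onto the server. Doing so only makes the server implementation more and more complex and bloats the code to cover every business problem, which is usually bad practice. Take this fast-sending, slow-reading client: besides mitigating it on the server with the method above, it is more important to rework the client so that its production and consumption rates are balanced. After all, what practical value is there in a client that frantically sends data without receiving and processing any? As I see it, such a client has only one use: stress testing to probe the server's limits. Beyond that it has no real business value, and making a compromise that lowers performance for an extreme scenario with no practical application has even less value.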

Sanitize invalid TCP_KEEPINTVL and simplify TCP_KEEPALIVE_ABORT_THRESHOLD on Solaris

Ping @oranagra

[Bug]: concurrent map write and read

Is that the whole fatal log? The full log contains all the goroutines accessing the map. Also, could you provide an example showing how you configure and use `gnet`?

[Bug]: concurrent map write and read

Thanks for the new details, but I have no clue about this at the moment. We may need more debugging. Could you try to use [`WithReusePort`](https://pkg.go.dev/github.com/panjf2000/gnet/v2#WithReusePort) or downgrade to v2.4.2...

[Bug]: concurrent map write and read

Did you ever call any APIs that are not concurrency-safe outside the event loop?

[Bug]: concurrent map write and read

Check out the doc [gnet.Conn](https://pkg.go.dev/github.com/panjf2000/gnet/v2#Conn) and see if you've called some ***non-concurrency-safe*** APIs outside the event loops (in other words, you must call the non-concurrency-safe methods like `gnet.Conn`.`Read()`/`Next()`/`Peek()`/`Write()`, etc. inside...

