
[Bug]: Unexpected log message suddenly appears while running normally

Open zhj0811 opened this issue 3 years ago • 18 comments

Actions I've taken before I'm here

  • [X] I've thoroughly read the documentations on this issue but still have no clue.
  • [X] I've searched the current list of Github issues but didn't find any duplicate issues that have been solved.
  • [X] I've searched the internet with this issue, but haven't found anything helpful.

What happened?

gnet is listening as a UDP service and prints the log [email protected]/reactor_default_linux.go:124 event-loop(6) is exiting due to error: server is going to be shutdown; the process then terminates.

Major version of gnet

v2

Specific version of gnet

v2.1.2

Operating system

Linux

Relevant log output

[email protected]/reactor_default_linux.go:124  event-loop(6) is exiting due to error: server is going to be shutdown

Code snippets (optional)

No response

How to Reproduce

No special steps; it is a simple UDP receiving service running normally.

Does this issue reproduce with the latest release?

It can reproduce with the latest release

zhj0811 avatar Jan 12 '23 10:01 zhj0811

Does it happen frequently?

panjf2000 avatar Jan 12 '23 11:01 panjf2000

Do you see an error log starting with error occurs in event-loop:? If not, it is most likely caused by returning Shutdown from OnTraffic().

panjf2000 avatar Jan 12 '23 11:01 panjf2000

Does it happen frequently?

Not frequently; it first appeared after three days of running.

zhj0811 avatar Jan 13 '23 01:01 zhj0811

Do you see an error log starting with error occurs in event-loop:? If not, it is most likely caused by returning Shutdown from OnTraffic().

The log file contains only this line: [email protected]/reactor_default_linux.go:124 event-loop(6) is exiting due to error: server is going to be shutdown 2023/01/12 16:48:45 Init config success. OnTraffic does not return Shutdown. image

zhj0811 avatar Jan 13 '23 01:01 zhj0811

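Do OnClose() or OnTick() return Shutdown? Or is there anywhere else it could happen? Could you search your whole codebase for Shutdown? Generally speaking, this error is triggered deliberately by user code. Could you also check whether your own business code logs any errors?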

panjf2000 avatar Jan 13 '23 01:01 panjf2000

None of the code I wrote uses Shutdown anywhere. image

zhj0811 avatar Jan 13 '23 01:01 zhj0811

image The logs are all normal, apart from one extra abnormal line.

zhj0811 avatar Jan 13 '23 01:01 zhj0811

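Do OnClose() or OnTick() return Shutdown? Or is there anywhere else it could happen? Could you search your whole codebase for Shutdown? Generally speaking, this error is triggered deliberately by user code. Could you also check whether your own business code logs any errors?

Would something like a kill command trigger Shutdown?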

zhj0811 avatar Jan 13 '23 03:01 zhj0811

For now I can't think of any other situation that would cause this. Could you provide demo code that reproduces the problem?

panjf2000 avatar Jan 16 '23 14:01 panjf2000

For now I can't think of any other situation that would cause this. Could you provide demo code that reproduces the problem?

type netServer struct {
	gnet.BuiltinEventEngine
	eng       gnet.Engine
	network   string
	addr      string
	multicore bool
}

func (s *netServer) OnBoot(eng gnet.Engine) gnet.Action {
	logger.Infof("running server on %s with multi-core=%t", fmt.Sprintf("%s://%s", s.network, s.addr), s.multicore)
	s.eng = eng
	return gnet.None
}

func (s *netServer) OnOpen(c gnet.Conn) ([]byte, gnet.Action) {
	logger.Debugf("connected with fd: %d, remote_addr: %s\n", c.Fd(), c.RemoteAddr().String())
	//c.SetContext(new(protocol.NetProtoCodec))
	return nil, gnet.None
}

func (s *netServer) OnClose(c gnet.Conn, err error) (action gnet.Action) {
	if err != nil {
		logger.Errorf("error occurred on fd: %d, remote_addr: %s, %v\n", c.Fd(), c.RemoteAddr().String(), err)
	}

	return gnet.Close
}

func (s *netServer) OnTraffic(c gnet.Conn) (action gnet.Action) {

	codec := new(NetProtoHeader)
	data, err := codec.Decode(c)
	if err != nil {
		logger.Errorf("invalid packet: %v", err)
		return gnet.Close

	}

	logger.Infof("gnet codec: %v", codec)
	err = ants.Submit(func() {
		switch codec.MsgType {
		case MsgTypePTRQuery:

			req := new(PTRReq)
			err := req.Unpack(data)
			if err != nil {
				logger.Errorf("Query ptr invalid packet: %s", err.Error())
				return
			}
			logger.Infof("PTR request: %+v", req)
			payload, err := singlePTR(req)
			if err != nil {
				logger.Errorf("Query ptr record by ip %s failed: %s", req.Qip.IpAddr.String(), err.Error())
				//return
				header := NetProtoHeader{MsgType: MsgTypePTRRes}
				resBody := DNSProtoSignalRes{
					DstIp:     req.DstIp,
					ProtoBody: []byte{0x00, 0x00},
				}
				payload = header.Encode(resBody.Pack())
			}
			sendSignalReqChan <- signalReq{Dst: req.DstIp.IpAddr.String(), Type: "ptr response", Data: payload}

		case MsgTypeTypeRes:
			//ants.Submit(func(){
			var ip net.IP
			switch len(data) {
			case 5 + 4, 17 + 4:
				ip = data[1 : len(data)-4]
			default:
				logger.Errorf("invalid payload body length: %d", len(data))
				return
			}
			logger.Infof("AIO addr: %s", ip.String())
			key := fmt.Sprintf("%s,%d", ip.String(), MsgTypeTypeRes)
			fn := func() (interface{}, error) {
				cType := binary.BigEndian.Uint16(data[len(data)-4:])
				ch, ok := ip2CType.Get(ip.String())
				if !ok {
					return nil, fmt.Errorf("invalid ip %s to conn type channel", ip.String())
				}
				ch <- cType
				logger.Infof("Ip %s connection type %d", ip.String(), cType)
				return nil, nil
			}
			_, err, _ := singleGroup.Do(key, fn)
			if err != nil {
				logger.Warnf("Handle ip %s type response failed %s", ip.String(), err.Error())
				return
			}
		default:
			logger.Errorf("invalid payload type: %d", codec.MsgType)
			return
		}
	})
	if err != nil {
		logger.Errorf("Submits a task to ants pool failed %s", err.Error())
	}
	return
}

func InitPtrServer() {
	port := viper.GetInt("ptr.port")
	if port == 0 {
		port = 30053
	}
	service := &netServer{
		network:   "udp",
		addr:      fmt.Sprintf(":%d", port),
		multicore: true,
	}
	err := gnet.Run(service, service.network+"://"+service.addr, gnet.WithMulticore(service.multicore), gnet.WithLogger(logger))
	if err != nil {
		logger.Errorf("running server on %s with multi-core=%t failed", fmt.Sprintf("%s://%s", service.network, service.addr), service.multicore)
		panic(err)
	}
}


zhj0811 avatar Feb 02 '23 07:02 zhj0811

The above is all of the gnet-related code in the project.

zhj0811 avatar Feb 02 '23 08:02 zhj0811

The gnet server is started inside a goroutine. Would a gnet shutdown on its own bring down the whole process?

zhj0811 avatar Feb 02 '23 08:02 zhj0811

The gnet server is started inside a goroutine. Would a gnet shutdown on its own bring down the whole process?

What do you mean? Did you call the gnet.Stop method yourself?

panjf2000 avatar Feb 05 '23 10:02 panjf2000

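The gnet server is started inside a goroutine. Would a gnet shutdown on its own bring down the whole process?

What do you mean? Did you call the gnet.Stop method yourself?

I did not call gnet.Stop myself. All of the gnet-related code in the project is shown above, and the InitPtrServer() above is started in a goroutine via go InitPtrServer(). Would the log [email protected]/reactor_default_linux.go:124 event-loop(6) is exiting due to error: server is going to be shutdown bring down the whole process?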

zhj0811 avatar Feb 06 '23 01:02 zhj0811

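The gnet server is started inside a goroutine. Would a gnet shutdown on its own bring down the whole process?

What do you mean? Did you call the gnet.Stop method yourself?

I did not call gnet.Stop myself. All of the gnet-related code in the project is shown above, and the InitPtrServer() above is started in a goroutine via go InitPtrServer(). Would the log [email protected]/reactor_default_linux.go:124 event-loop(6) is exiting due to error: server is going to be shutdown bring down the whole process?

It will not make the whole process exit; it only makes gnet.Run() return. Also, if the server is going to be shutdown error had occurred, the exit logs of all event loops should have been printed, whereas yours shows only one of them. The whole sequence is very strange. Has the same problem occurred again since then?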

panjf2000 avatar Feb 06 '23 04:02 panjf2000

That log has not appeared again since.

zhj0811 avatar Feb 06 '23 06:02 zhj0811

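I plan to improve the error-logging code so that the corresponding stack trace is printed as well, which will make it easier to locate problems in the future. As for your case, please keep an eye on it, and if it reproduces we can look into it together. Thanks.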

panjf2000 avatar Feb 06 '23 07:02 panjf2000

OK, I will follow up if the problem reproduces.

zhj0811 avatar Feb 08 '23 07:02 zhj0811

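It looks like this problem has not reproduced for a long time, so I'm closing this issue for now. Feel free to reopen it at any time if it happens again.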

panjf2000 avatar Jul 01 '24 00:07 panjf2000

🤖 Non-English text detected, translating...


It seems that this problem has not recurred for a long time, so we will temporarily close this issue. You can reopen it at any time if it happens again.

gh-translator avatar Jul 01 '24 00:07 gh-translator