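The NIO epoll bug in JDK 1.7 and earlier
JDK NIO has a notorious epoll bug that makes the Selector spin in an empty polling loop and eventually drives CPU usage to 100%. Officially the problem was fixed in JDK 1.6 update 18, but it is still present in JDK 1.7: the bug is only less likely to occur and has not been fundamentally resolved. When it strikes, CPU usage suddenly jumps to over 1000%. A sample stack trace: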
"NettyClientWorker-thread-7" daemon prio=10 tid=0x00007ffbe80c3800 nid=0x3539 runnable [0x00007ffc2cccb000]
java.lang.Thread.State: RUNNABLE
at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:79)
at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
- locked <0x0000000682334a28> (a sun.nio.ch.Util$2)
- locked <0x0000000682334a18> (a java.util.Collections$UnmodifiableSet)
- locked <0x0000000682333610> (a sun.nio.ch.EPollSelectorImpl)
at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
at org.jboss.netty.channel.socket.nio.SelectorUtil.select(SelectorUtil.java:52)
at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:200)
at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:38)
at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
Another example:
"localhost-startStop-1-SendThread(l-zk3.plat.cn6.com:2181)" daemon prio=10 tid=0x00007ffbe4469000 nid=0x3520 runnable [0x00007ffc2e5e4000]
java.lang.Thread.State: RUNNABLE
at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:79)
at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
- locked <0x00000006823c1908> (a sun.nio.ch.Util$2)
- locked <0x00000006823c18f8> (a java.util.Collections$UnmodifiableSet)
- locked <0x00000006823bfd50> (a sun.nio.ch.EPollSelectorImpl)
at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:338)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
Related bug reports:
http://bugs.java.com/bugdatabase/view_bug.do?bug_id=6403933
http://bugs.java.com/bugdatabase/view_bug.do?bug_id=2147719
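A common application-level workaround for the empty-polling problem described above is to count how often `select(timeout)` returns early with no ready keys, and to replace the Selector once that happens too many times in a row. The sketch below is only an illustration under assumptions: the class `SpinGuard`, its method names, and the threshold are hypothetical, and the code is not taken from the JDK or from any library. Netty 4 uses a similar idea, with a default rebuild threshold of 512.

```java
import java.io.IOException;
import java.nio.channels.SelectableChannel;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;

/** Hypothetical helper: detects a select() loop that keeps returning early with nothing to do. */
final class SpinGuard {
    private final int threshold;   // assumed value, chosen by the caller
    private int prematureReturns;  // consecutive early returns with zero ready keys

    SpinGuard(int threshold) {
        this.threshold = threshold;
    }

    /**
     * Records the outcome of one select(timeout) call.
     * Returns true once `threshold` consecutive calls came back with zero
     * ready keys before the timeout had elapsed; the counter is then reset.
     */
    boolean record(int readyKeys, long elapsedNanos, long timeoutNanos) {
        if (readyKeys > 0 || elapsedNanos >= timeoutNanos) {
            prematureReturns = 0;  // real work or a genuine timeout
            return false;
        }
        if (++prematureReturns >= threshold) {
            prematureReturns = 0;
            return true;
        }
        return false;
    }

    /** Opens a new Selector, moves every valid registration to it, and closes the old one. */
    static Selector rebuild(Selector old) throws IOException {
        Selector fresh = Selector.open();
        for (SelectionKey key : old.keys()) {
            if (!key.isValid()) {
                continue;
            }
            SelectableChannel channel = key.channel();
            int ops = key.interestOps();
            Object attachment = key.attachment();
            key.cancel();                              // detach from the old selector
            channel.register(fresh, ops, attachment);  // attach to the new one
        }
        old.close();
        return fresh;
    }
}
```

In an event loop, one would time each `selector.select(timeoutMillis)` call with `System.nanoTime()`, pass the result to `record`, and call `rebuild` when it returns true. This sketch also counts deliberate `wakeup()` calls as early returns, so a real loop would need to exclude those.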
Does a log like this necessarily mean it is this bug? I ran into it too, but I am not sure whether it is the same thing.
"localhost-startStop-1-SendThread(l-zk3.plat.cn6.com:2181)" daemon prio=10 tid=0x00007ffbe4469000 nid=0x3520 runnable [0x00007ffc2e5e4000]
java.lang.Thread.State: RUNNABLE
at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:79)
at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
- locked <0x00000006823c1908> (a sun.nio.ch.Util$2)
- locked <0x00000006823c18f8> (a java.util.Collections$UnmodifiableSet)
- locked <0x00000006823bfd50> (a sun.nio.ch.EPollSelectorImpl)
at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
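Under what conditions did you reproduce it? Could we compare notes?
Seeing this stack does not always mean there is a problem. What is your CPU usage?
About 40% for the whole process, sometimes 20%, so it is probably not this problem then.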
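Since the same stack appears both for a healthy thread blocked in `epollWait` and for one that is spinning, one way to tell them apart is to sample per-thread CPU time. The sketch below uses the standard `ThreadMXBean` API. The class `ThreadCpuSampler` and its methods are hypothetical names introduced for illustration, and the code has to run inside the JVM being inspected.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: estimates how much CPU each live thread used over a short interval. */
final class ThreadCpuSampler {

    /** Percentage of one core used between two CPU-time readings; 0 if a reading is unavailable. */
    static double cpuPercent(long cpuNanosBefore, long cpuNanosAfter, long wallNanos) {
        if (cpuNanosBefore < 0 || cpuNanosAfter < 0 || wallNanos <= 0) {
            return 0.0;  // getThreadCpuTime returns -1 for dead threads or disabled timing
        }
        return 100.0 * (cpuNanosAfter - cpuNanosBefore) / wallNanos;
    }

    /** Reads every thread's CPU time twice, intervalMillis apart, and returns name -> percent. */
    static Map<String, Double> sample(long intervalMillis) throws InterruptedException {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        Map<String, Double> result = new HashMap<>();
        if (!mx.isThreadCpuTimeSupported() || !mx.isThreadCpuTimeEnabled()) {
            return result;  // the JVM cannot report per-thread CPU time
        }
        long[] ids = mx.getAllThreadIds();
        long[] before = new long[ids.length];
        for (int i = 0; i < ids.length; i++) {
            before[i] = mx.getThreadCpuTime(ids[i]);
        }
        long start = System.nanoTime();
        Thread.sleep(intervalMillis);
        long wall = System.nanoTime() - start;
        for (int i = 0; i < ids.length; i++) {
            ThreadInfo info = mx.getThreadInfo(ids[i]);
            if (info == null) {
                continue;  // thread ended during the interval
            }
            long after = mx.getThreadCpuTime(ids[i]);
            // Threads sharing a name overwrite each other in this simple map.
            result.put(info.getThreadName(), cpuPercent(before[i], after, wall));
        }
        return result;
    }
}
```

A selector thread that stays near 100% of a core while handling little or no traffic is a candidate for the empty-polling problem. One that sits near 0% is simply blocked waiting for events.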
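I ran into this problem too, but CPU usage is not 100%. There is basically no CPU usage at all; the thread just stays alive. How can this be solved?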
I also hit this problem when using NIO, with heavy CPU usage:
"$_NIOREACTOR-0-RW" #27 prio=5 os_prio=0 tid=0x00007f3e14102000 nid=0xb6d runnable [0x00007f3e7471c000]
java.lang.Thread.State: RUNNABLE
at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
- locked <0x00000006400804e8> (a sun.nio.ch.Util$3)
- locked <0x00000006400801d0> (a java.util.Collections$UnmodifiableSet)
- locked <0x000000064007e8c0> (a sun.nio.ch.EPollSelectorImpl)
at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
at io.mycat.net.NIOReactor$RW.run(NIOReactor.java:93)
at java.lang.Thread.run(Thread.java:745)
CPU context switches on Linux are extremely frequent; the CPU is worn out by switching and does very little business processing.
Hello, may I ask how you handled this situation? I am currently on 1.8.0_45.
My thread log:
"main-SendThread(10.200.56.241:2181)" #25 daemon prio=5 os_prio=0 tid=0x00007ff5307c9800 nid=0xf06a runnable [0x00007ff4a1439000]
java.lang.Thread.State: RUNNABLE
at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:79)
at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
- locked <0x00000000faf3ee30> (a sun.nio.ch.Util$2)
- locked <0x00000000faf3ee48> (a java.util.Collections$UnmodifiableSet)
- locked <0x00000000faf7eb50> (a sun.nio.ch.EPollSelectorImpl)
at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:349)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1081)
My JDK version is 1.8.0_171, the project uses Spring Boot 1.5.13.RELEASE, and the web container is Spring Boot's embedded Tomcat. Recently the project has often been very sluggish: requests take a very long time and CPU usage is high. The log I collected with jstack is similar to the ones above:
"NioBlockingSelector.BlockPoller-1" #39 daemon prio=5 os_prio=0 tid=0x00007f6a0e346000 nid=0xa23 runnable [0x00007f69eb8f9000]
java.lang.Thread.State: RUNNABLE
at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
- locked <0x00000005d1095198> (a sun.nio.ch.Util$3)
- locked <0x00000005d1095188> (a java.util.Collections$UnmodifiableSet)
- locked <0x00000005d10951a8> (a sun.nio.ch.EPollSelectorImpl)
at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
at org.apache.tomcat.util.net.NioBlockingSelector$BlockPoller.run(NioBlockingSelector.java:298)
Locked ownable synchronizers:
- None
May I ask the original poster how to fix this problem?