
java.io.IOException: Too many open files

Open fp24 opened this issue 6 years ago • 13 comments

Hi, yesterday I got a "too many open files" error. I checked all the fd limit configuration but still can't figure it out. Here is some of the error log:

2018-12-17 18:07:05.872 WARN 26878 --- [ceptor-thread-0] io.netty.channel.DefaultChannelPipeline : An exceptionCaught() event was fired, and it reached at the tail of the pipeline. It usually means the last handler in the pipeline did not handle the exception.

java.io.IOException: Too many open files
        at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method) ~[na:1.8.0_131]
        at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:422) ~[na:1.8.0_131]
        at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:250) ~[na:1.8.0_131]
        at io.netty.util.internal.SocketUtils$5.run(SocketUtils.java:110) ~[netty-common-4.1.19.Final.jar!/:4.1.19.Final]
        at io.netty.util.internal.SocketUtils$5.run(SocketUtils.java:107) ~[netty-common-4.1.19.Final.jar!/:4.1.19.Final]
        at java.security.AccessController.doPrivileged(Native Method) ~[na:1.8.0_131]
        at io.netty.util.internal.SocketUtils.accept(SocketUtils.java:107) ~[netty-common-4.1.19.Final.jar!/:4.1.19.Final]
        at io.netty.channel.socket.nio.NioServerSocketChannel.doReadMessages(NioServerSocketChannel.java:141) ~[netty-transport-4.1.19.Final.jar!/:4.1.19.Final]
        at io.netty.channel.nio.AbstractNioMessageChannel$NioMessageUnsafe.read(AbstractNioMessageChannel.java:75) ~[netty-transport-4.1.19.Final.jar!/:4.1.19.Final]
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:645) [netty-transport-4.1.19.Final.jar!/:4.1.19.Final]
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:580) [netty-transport-4.1.19.Final.jar!/:4.1.19.Final]
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:497) [netty-transport-4.1.19.Final.jar!/:4.1.19.Final]
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:451) [netty-transport-4.1.19.Final.jar!/:4.1.19.Final]
        at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:886) [netty-common-4.1.19.Final.jar!/:4.1.19.Final]
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [netty-common-4.1.19.Final.jar!/:4.1.19.Final]
        at java.lang.Thread.run(Thread.java:748) [na:1.8.0_131]


2018-12-17 14:30:12.842 WARN 26878 --- [entExecutor-1-6] io.netty.util.concurrent.DefaultPromise : An exception was thrown by io.vertx.core.net.impl.ChannelProvider$$Lambda$87/93332981.operationComplete()


some system config

cat /etc/security/limits.conf

*          soft    nofile          2000000
*          hard    nofile          2000000

ulimit -n 2000000

lsof -p <pid> shows about 5000 entries


fp24 avatar Dec 18 '18 02:12 fp24
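Editor's aside: when chasing this kind of leak, it helps to break the process's open descriptors down by type rather than only checking limits — the limits above are clearly high enough, so the question is what kind of descriptor is growing. A minimal sketch, assuming a Linux host with /proc and lsof available (PID points at the current shell here only so the sketch is runnable; substitute the Java server's pid in practice):

```shell
# Total open descriptors for a process, via the Linux /proc interface.
PID=$$
ls /proc/"$PID"/fd | wc -l

# Tally lsof's TYPE column (IPv4, FIFO, REG, ...) to see which kind of
# descriptor is actually accumulating.
lsof -p "$PID" 2>/dev/null | awk 'NR>1 {print $5}' | sort | uniq -c | sort -rn
```

If the count is dominated by IPv4/IPv6 entries the leak is network connections; a sea of FIFO and REG/[eventpoll] entries points at leaked selectors or event loops instead.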

@vietj please help me out, thanks.

fp24 avatar Dec 18 '18 02:12 fp24

Can you provide a reproducer @fp24? Which version is it?

vietj avatar Dec 18 '18 06:12 vietj

@vietj I am also facing the same issue. It seems that Vert.x threads and their descriptors are not getting closed: when we started the server, lsof -p <pid> showed 656 entries; after two days the count had reached 8637 and it keeps increasing. We are using Vert.x 3.4.2 and Netty 4.1.8.Final.

Log:

java.io.IOException: Too many open files
        at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method) ~[?:1.8.0_192]
        at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:422) ~[?:1.8.0_192]
        at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:250) ~[?:1.8.0_192]
        at io.netty.util.internal.SocketUtils$5.run(SocketUtils.java:110) ~[netty-common-4.1.8.Final.jar:4.1.8.Final]
        at io.netty.util.internal.SocketUtils$5.run(SocketUtils.java:107) ~[netty-common-4.1.8.Final.jar:4.1.8.Final]
        at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_192]
        at io.netty.util.internal.SocketUtils.accept(SocketUtils.java:107) ~[netty-common-4.1.8.Final.jar:4.1.8.Final]
        at io.netty.channel.socket.nio.NioServerSocketChannel.doReadMessages(NioServerSocketChannel.java:141) ~[netty-transport-4.1.8.Final.jar:4.1.8.Final]
        at io.netty.channel.nio.AbstractNioMessageChannel$NioMessageUnsafe.read(AbstractNioMessageChannel.java:75) [netty-transport-4.1.8.Final.jar:4.1.8.Final]
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:642) [netty-transport-4.1.8.Final.jar:4.1.8.Final]
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:565) [netty-transport-4.1.8.Final.jar:4.1.8.Final]
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:479) [netty-transport-4.1.8.Final.jar:4.1.8.Final]
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:433) [netty-transport-4.1.8.Final.jar:4.1.8.Final]
        at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858) [netty-common-4.1.8.Final.jar:4.1.8.Final]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_192]

Also, lsof -p <pid> shows many eventpoll and pipe entries, open alongside different UDP sockets, that never get closed.

java 502555 rdr_st5 175u IPv4 122155770 0t0 UDP *:52768
java 502555 rdr_st5 176r FIFO 0,8 0t0 122338426 pipe
java 502555 rdr_st5 177w FIFO 0,8 0t0 122338426 pipe
java 502555 rdr_st5 178r FIFO 0,8 0t0 122149321 pipe
java 502555 rdr_st5 179w FIFO 0,8 0t0 122149321 pipe
java 502555 rdr_st5 180u REG 0,9 0 4021 [eventpoll]
java 502555 rdr_st5 181r FIFO 0,8 0t0 122149322 pipe
java 502555 rdr_st5 182w FIFO 0,8 0t0 122149322 pipe
java 502555 rdr_st5 183u REG 0,9 0 4021 [eventpoll]
java 502555 rdr_st5 184r FIFO 0,8 0t0 122149323 pipe
java 502555 rdr_st5 185w FIFO 0,8 0t0 122149323 pipe
java 502555 rdr_st5 186u REG 0,9 0 4021 [eventpoll]
java 502555 rdr_st5 187r FIFO 0,8 0t0 122149324 pipe
java 502555 rdr_st5 188w FIFO 0,8 0t0 122149324 pipe
java 502555 rdr_st5 189u REG 0,9 0 4021 [eventpoll]
java 502555 rdr_st5 190r FIFO 0,8 0t0 122149325 pipe
java 502555 rdr_st5 191w FIFO 0,8 0t0 122149325 pipe
java 502555 rdr_st5 192u REG 0,9 0 4021 [eventpoll]
java 502555 rdr_st5 193r FIFO 0,8 0t0 122149326 pipe
java 502555 rdr_st5 194w FIFO 0,8 0t0 122149326 pipe
java 502555 rdr_st5 195u REG 0,9 0 4021 [eventpoll]
java 502555 rdr_st5 196r FIFO 0,8 0t0 122149327 pipe
java 502555 rdr_st5 197w FIFO 0,8 0t0 122149327 pipe
java 502555 rdr_st5 198u REG 0,9 0 4021 [eventpoll]
java 502555 rdr_st5 199r FIFO 0,8 0t0 122149328 pipe
java 502555 rdr_st5 200w FIFO 0,8 0t0 122149328 pipe
java 502555 rdr_st5 201u REG 0,9 0 4021 [eventpoll]
java 502555 rdr_st5 202r FIFO 0,8 0t0 122149329 pipe
java 502555 rdr_st5 203w FIFO 0,8 0t0 122149329 pipe
java 502555 rdr_st5 204u REG 0,9 0 4021 [eventpoll]
java 502555 rdr_st5 205u REG 0,9 0 4021 [eventpoll]
java 502555 rdr_st5 206u IPv4 122149331 0t0 UDP *:40091
java 502555 rdr_st5 207r FIFO 0,8 0t0 122338427 pipe
java 502555 rdr_st5 208w FIFO 0,8 0t0 122338427 pipe
java 502555 rdr_st5 209u REG 0,9 0 4021 [eventpoll]
java 502555 rdr_st5 210r FIFO 0,8 0t0 122338428 pipe
java 502555 rdr_st5 211w FIFO 0,8 0t0 122338428 pipe
java 502555 rdr_st5 212u REG 0,9 0 4021 [eventpoll]
java 502555 rdr_st5 213r FIFO 0,8 0t0 122338429 pipe
java 502555 rdr_st5 214w FIFO 0,8 0t0 122338429 pipe
java 502555 rdr_st5 215u REG 0,9 0 4021 [eventpoll]
java 502555 rdr_st5 216r FIFO 0,8 0t0 122338430 pipe
java 502555 rdr_st5 217w FIFO 0,8 0t0 122338430 pipe
java 502555 rdr_st5 218u REG 0,9 0 4021 [eventpoll]
java 502555 rdr_st5 219r FIFO 0,8 0t0 122338431 pipe
java 502555 rdr_st5 220w FIFO 0,8 0t0 122338431 pipe
java 502555 rdr_st5 221u REG 0,9 0 4021 [eventpoll]
java 502555 rdr_st5 222r FIFO 0,8 0t0 122338432 pipe
java 502555 rdr_st5 223w FIFO 0,8 0t0 122338432 pipe
java 502555 rdr_st5 224u REG 0,9 0 4021 [eventpoll]
java 502555 rdr_st5 225r FIFO 0,8 0t0 122338433 pipe
java 502555 rdr_st5 226w FIFO 0,8 0t0 122338433 pipe
java 502555 rdr_st5 227u REG 0,9 0 4021 [eventpoll]
java 502555 rdr_st5 228r FIFO 0,8 0t0 122338434 pipe
java 502555 rdr_st5 229w FIFO 0,8 0t0 122338434 pipe
java 502555 rdr_st5 230u REG 0,9 0 4021 [eventpoll]
java 502555 rdr_st5 231r FIFO 0,8 0t0 126263781 pipe
java 502555 rdr_st5 232u IPv4 122338436 0t0 UDP *:38325
java 502555 rdr_st5 233r FIFO 0,8 0t0 123010589 pipe
java 502555 rdr_st5 234r FIFO 0,8 0t0 123010631 pipe
java 502555 rdr_st5 235r FIFO 0,8 0t0 122426876 pipe
java 502555 rdr_st5 236u IPv4 122340294 0t0 UDP *:38737
java 502555 rdr_st5 237w FIFO 0,8 0t0 122426876 pipe
java 502555 rdr_st5 238r FIFO 0,8 0t0 122426842 pipe
java 502555 rdr_st5 239w FIFO 0,8 0t0 122426842 pipe
java 502555 rdr_st5 240r FIFO 0,8 0t0 122426853 pipe
java 502555 rdr_st5 241u REG 0,9 0 4021 [eventpoll]
java 502555 rdr_st5 242r FIFO 0,8 0t0 122426843 pipe
java 502555 rdr_st5 243w FIFO 0,8 0t0 122426843 pipe
java 502555 rdr_st5 244u REG 0,9 0 4021 [eventpoll]
java 502555 rdr_st5 245r FIFO 0,8 0t0 122426844 pipe
java 502555 rdr_st5 246w FIFO 0,8 0t0 122426844 pipe
java 502555 rdr_st5 247u REG 0,9 0 4021 [eventpoll]
java 502555 rdr_st5 248r FIFO 0,8 0t0 122426845 pipe
java 502555 rdr_st5 249w FIFO 0,8 0t0 122426845 pipe
java 502555 rdr_st5 250u REG 0,9 0 4021 [eventpoll]
java 502555 rdr_st5 251r FIFO 0,8 0t0 122426846 pipe
java 502555 rdr_st5 252w FIFO 0,8 0t0 122426846 pipe
java 502555 rdr_st5 253u REG 0,9 0 4021 [eventpoll]

vsrivastav avatar Jun 18 '19 07:06 vsrivastav
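Editor's aside: the repeating pattern in the listing above — an [eventpoll] entry plus a read/write pipe pair — is what a java.nio Selector looks like in lsof on Linux, and each Netty event loop owns one. Leaking such triples usually means event loops (or whole Vertx instances) are being created and never closed. A JDK-only sketch of the mechanism, using a hypothetical class name:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.Selector;
import java.util.ArrayList;
import java.util.List;

// Each java.nio Selector owns an [eventpoll] descriptor plus a pipe pair on
// Linux -- the triples visible in the lsof output above. Opening Selectors
// (directly, or indirectly via a new Vertx instance per request) without
// closing them grows the process fd count in steps of roughly three.
public class SelectorLeakDemo {

    // Opens n selectors, then closes them all; returns how many were opened.
    // Skipping the close() loop is the leak pattern discussed in this issue.
    static int openAndClose(int n) {
        List<Selector> selectors = new ArrayList<>();
        try {
            for (int i = 0; i < n; i++) {
                selectors.add(Selector.open()); // consumes fds until closed
            }
            for (Selector s : selectors) {
                s.close(); // the fix: always close what you open
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return selectors.size();
    }

    public static void main(String[] args) {
        System.out.println("cycled " + openAndClose(10) + " selectors");
    }
}
```

The practical consequence: share one Vertx instance across the application instead of creating one per request or per UDP socket.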

@vsrivastav can you try with a more recent version?

vietj avatar Jun 18 '19 12:06 vietj

I have seen a similar issue multiple times. It happens after the server runs for a while (a week or so), and then this error brings the service down.

An exceptionCaught() event was fired, and it reached at the tail of the pipeline. It usually means the last handler in the pipeline did not handle the exception.
java.io.IOException: Too many open files
        at java.base/sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
        at java.base/sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:533)
        at java.base/sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:285)
        at io.netty.util.internal.SocketUtils$5.run(SocketUtils.java:110)
        at io.netty.util.internal.SocketUtils$5.run(SocketUtils.java:107)
        at java.base/java.security.AccessController.doPrivileged(Native Method)
        at io.netty.util.internal.SocketUtils.accept(SocketUtils.java:107)
        at io.netty.channel.socket.nio.NioServerSocketChannel.doReadMessages(NioServerSocketChannel.java:143)
        at io.netty.channel.nio.AbstractNioMessageChannel$NioMessageUnsafe.read(AbstractNioMessageChannel.java:75)
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:644)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:579)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:496)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:450)
        at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:897)
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
        at java.base/java.lang.Thread.run(Thread.java:834)

FreeTymeKiyan avatar Jul 13 '19 07:07 FreeTymeKiyan

BTW it's version 3.6.2. It seems to be related to the JDBC driver as well?

The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.
        at com.mysql.cj.jdbc.exceptions.SQLError.createCommunicationsException(SQLError.java:174)
        at com.mysql.cj.jdbc.exceptions.SQLExceptionsMapping.translateException(SQLExceptionsMapping.java:64)
        at com.mysql.cj.jdbc.ConnectionImpl.createNewIO(ConnectionImpl.java:832)
        at com.mysql.cj.jdbc.ConnectionImpl.<init>(ConnectionImpl.java:456)
        at com.mysql.cj.jdbc.ConnectionImpl.getInstance(ConnectionImpl.java:240)
        at com.mysql.cj.jdbc.NonRegisteringDriver.connect(NonRegisteringDriver.java:207)
        at com.mchange.v2.c3p0.DriverManagerDataSource.getConnection(DriverManagerDataSource.java:175)
        at com.mchange.v2.c3p0.WrapperConnectionPoolDataSource.getPooledConnection(WrapperConnectionPoolDataSource.java:220)
        at com.mchange.v2.c3p0.WrapperConnectionPoolDataSource.getPooledConnection(WrapperConnectionPoolDataSource.java:206)
        at com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool$1PooledConnectionResourcePoolManager.acquireResource(C3P0PooledConnectionPool.java:203)
        at com.mchange.v2.resourcepool.BasicResourcePool.doAcquire(BasicResourcePool.java:1138)
        at com.mchange.v2.resourcepool.BasicResourcePool.doAcquireAndDecrementPendingAcquiresWithinLockOnSuccess(BasicResourcePool.java:1125)
        at com.mchange.v2.resourcepool.BasicResourcePool.access$700(BasicResourcePool.java:44)
        at com.mchange.v2.resourcepool.BasicResourcePool$ScatteredAcquireTask.run(BasicResourcePool.java:1870)
        at com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread.run(ThreadPoolAsynchronousRunner.java:696)
Caused by: com.mysql.cj.exceptions.CJCommunicationsException: Communications link failure

The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.
        at jdk.internal.reflect.GeneratedConstructorAccessor32.newInstance(Unknown Source)
        at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:490)
        at com.mysql.cj.exceptions.ExceptionFactory.createException(ExceptionFactory.java:61)
        at com.mysql.cj.exceptions.ExceptionFactory.createException(ExceptionFactory.java:105)
        at com.mysql.cj.exceptions.ExceptionFactory.createException(ExceptionFactory.java:151)
        at com.mysql.cj.exceptions.ExceptionFactory.createCommunicationsException(ExceptionFactory.java:167)
        at com.mysql.cj.protocol.a.NativeSocketConnection.connect(NativeSocketConnection.java:91)
        at com.mysql.cj.NativeSession.connect(NativeSession.java:152)
        at com.mysql.cj.jdbc.ConnectionImpl.connectOneTryOnly(ConnectionImpl.java:952)
        at com.mysql.cj.jdbc.ConnectionImpl.createNewIO(ConnectionImpl.java:822)
        ... 12 more
Caused by: java.net.SocketException: Too many open files
        at java.base/java.net.Socket.createImpl(Socket.java:462)
        at java.base/java.net.Socket.getImpl(Socket.java:522)
        at java.base/java.net.Socket.setTcpNoDelay(Socket.java:980)
        at com.mysql.cj.protocol.StandardSocketFactory.configureSocket(StandardSocketFactory.java:94)
        at com.mysql.cj.protocol.StandardSocketFactory.connect(StandardSocketFactory.java:165)
        at com.mysql.cj.protocol.a.NativeSocketConnection.connect(NativeSocketConnection.java:65)
        ... 15 more

FreeTymeKiyan avatar Jul 13 '19 07:07 FreeTymeKiyan
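Editor's aside: since the c3p0 pool appears in the trace above, one avenue is to make the pool itself surface leaked connections. c3p0 (not Vert.x) supports, among others, the following settings in a c3p0.properties file; the values below are purely illustrative, not recommendations:

```properties
# Cap how many MySQL sockets the pool may ever hold open.
c3p0.maxPoolSize=20
# Treat a connection checked out for more than 300 s as leaked and destroy it.
c3p0.unreturnedConnectionTimeout=300
# Log the stack trace of the code path that checked out the leaked connection.
c3p0.debugUnreturnedConnectionStackTraces=true
```

The last two options are intended for debugging leak sources rather than as a production fix.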

Can you use the lsof command to investigate the file descriptors?

vietj avatar Jul 13 '19 10:07 vietj

@vietj I tried it after running the server for a while. It seems like all of them are TCP connections, and the number kept growing over two days. I'm guessing those idle connections are never closed.

Other than setting an idle timeout when creating the HTTP server, is there anything else I can do? Like somehow manually closing a connection?

FreeTymeKiyan avatar Jul 15 '19 05:07 FreeTymeKiyan

Setting an idle timeout is a good way to close such connections. If you need another way to clean them up, you can set a connectionHandler on the HttpServer, keep those connections in a map, and close them directly. Clients should also somehow close idle connections...

vietj avatar Jul 15 '19 07:07 vietj
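Editor's aside: the suggestion above (track connections from a connectionHandler, then close them directly) can be sketched framework-agnostically. ConnectionTracker is a hypothetical helper; in Vert.x the AutoCloseable would be the connection object handed to the server's connectionHandler:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Minimal sketch of vietj's suggestion: register every live connection in a
// map as it arrives, deregister on close, and expose a way to force-close
// everything still tracked (e.g. from an admin endpoint or periodic sweep)
// so descriptors cannot pile up indefinitely.
class ConnectionTracker {
    private final Map<String, AutoCloseable> live = new ConcurrentHashMap<>();

    void onConnect(String id, AutoCloseable conn) { live.put(id, conn); }

    void onDisconnect(String id) { live.remove(id); }

    int openCount() { return live.size(); }

    // Force-close and forget every tracked connection.
    void closeAll() {
        live.forEach((id, conn) -> {
            try { conn.close(); } catch (Exception ignored) { }
        });
        live.clear();
    }
}
```

The map also doubles as cheap observability: logging openCount() periodically shows whether connections really are accumulating.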

@vietj thanks for the reply! Quick question: when you say clients, those clients are not necessarily the Vert.x HTTP client, right? Other HTTP client implementations should also be fine as long as the connection is closed?

FreeTymeKiyan avatar Jul 16 '19 04:07 FreeTymeKiyan

yes, clients in general

vietj avatar Jul 16 '19 07:07 vietj

I noticed that there are similar problems at https://github.com/filip26/titanium-json-ld/issues/33, https://github.com/codecentric/spring-boot-admin/issues/1478, https://github.com/Linux74656/SpaceEngineersLinuxPatches/issues/35, https://github.com/bigbluebutton/bigbluebutton/issues/8809 and https://github.com/reactor/reactor-netty/issues/1163. Perhaps we can refer to these issues to find more context about the bug (e.g. its cause and workaround, the related library, or steps to reproduce).

SylviaLee89 avatar Aug 20 '20 11:08 SylviaLee89

I have seen a similar issue multiple times too.
setIdleTimeout can solve it, e.g. vertx.createHttpServer(new HttpServerOptions().setIdleTimeout(60)). When the client uses keep-alive, Vert.x does not close the connection until the client disconnects, so the problem can appear after the server runs for a while: too many clients ask to keep their connections alive, and there is no need to keep them alive forever. In my case 60 s was enough. This issue should be closed.

leewp avatar Jun 08 '22 05:06 leewp
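Editor's aside: for reference, the idle-timeout setup described above, expanded slightly into a complete server. This is a sketch against the Vert.x 3.x API; the port, timeout value, and response body are illustrative:

```java
import io.vertx.core.Vertx;
import io.vertx.core.http.HttpServerOptions;

public class IdleTimeoutServer {
    public static void main(String[] args) {
        Vertx vertx = Vertx.vertx();
        // Close any connection with no traffic for 60 seconds, so that
        // keep-alive clients cannot pin descriptors open forever.
        HttpServerOptions options = new HttpServerOptions().setIdleTimeout(60);
        vertx.createHttpServer(options)
            .requestHandler(req -> req.response().end("ok"))
            .listen(8080);
    }
}
```

Pick the timeout to exceed your longest expected request/response gap; too small a value will churn connections that clients intended to reuse.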