plc4x
plc4x copied to clipboard
[Serial-Transport] NettyChannelFactory.closeEventLoopForChannel never returns
I'm polling a modbus device over a RS485 serial connection:
... connectionString":"modbus-rtu:serial:///dev/ttyAMA2?BAUD_RATE=19200&STOP_BITS=1&PARITY=EVEN&unit-identifier=1"
...
"sources. ... .pollReturnInterval": 5000,
"jobs. ... .interval": 2000,
It works for a few minutes and then:
[2022-10-28 17:20:50,239] WARN [conveyor12source|task-0] Request finished with exception. Reporting
Connection as Broken (org.apache.plc4x.java.utils.connectionpool2.CachedPlcConnection:203)
java.util.concurrent.CancellationException
at java.base/java.util.concurrent.CompletableFuture.cancel(CompletableFuture.java:2396)
at org.apache.plc4x.java.utils.connectionpool2.CachedPlcConnection.lambda$wrapReadWithTimeout$1(CachedPlcConnection.java:105)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
[2022-10-28 17:20:50,603] INFO [conveyor12source|task-0]
Channel is closed, closing worker Group also (org.apache.plc4x.java.spi.connection.NettyChannelFactory:150)
[2022-10-28
17:20:50,614] INFO [conveyor12source|task-0] Worker Group was closed successfully! (org.apache.plc4x.java.spi.connection.NettyChannelFactory:155)
[2022-10-28
17:20:50,625] WARN [conveyor12source|task-0] Selector.select() returned prematurely 512 times in a row;
rebuilding Selector org.apache.plc4x.java.transport.serial.SerialPollingSelector@1a6dc4ab. (io.netty.channel.nio.NioEventLoop:628)
[2022-10-28
17:20:50,626] INFO [conveyor12source|task-0] Migrated 1 channel(s) to the new Selector. (io.netty.channel.nio.NioEventLoop:499)
[2022-10-28
17:20:50,637] WARN [conveyor12source|task-0] Selector.select() returned prematurely 512 times in a row;
rebuilding Selector org.apache.plc4x.java.transport.serial.SerialPollingSelector@2107b3db. (io.netty.channel.nio.NioEventLoop:628)
The Migrated 1 channel(s) and Selector.select() log entires continues endless.
I've found that the call in NettyChannelFactory to:
@Override
public void closeEventLoopForChannel(Channel channel) {
if (eventLoops.containsKey(channel))
{
logger.info("Channel is closed, closing worker Group also");
EventLoopGroup
eventExecutors = eventLoops.get(channel);
eventLoops.remove(channel);
eventExecutors.shutdownGracefully().awaitUninterruptibly();
logger.info("Worker Group was closed successfully!");
} else {
logger.warn("Trying
to remove EventLoop for Channel {} but have none stored", channel);
}
}
eventExecutors.shutdownGracefully() never returns even though the above log indicates it did. I've changed the call to:
eventExecutors.shutdownGracefully().addListener(e->{ LoggerFactory.getLogger(NettyChannelFactory.class).info("closeEventLoopForChannel:
Worker group finally shut down"); } );
in an attempt to eliminate any possible deadlocking.
Any suggestions would be much appreciated.
Imported from Jira PLC4X-358. Original Jira may contain additional context. Reported by: Kielsholm.
try my fork: https://github.com/spnettec/plc4x
Anything in that fork, that you would like to contribute back to the project?
We have the same issue for Modbus TCP. Is there something new for a fix of the bug ?
We added a timeout: workerGroup.shutdownGracefully().awaitUninterruptibly(2000);
Which version of PLC4J are you running on? and if possible can you provide a PCAP ?
We use PLC4J 0.11.0. Can't provide a PCAP.
@chrisdutz Does something speak against adding a timeout for the graceful shutdown of the worker groups as suggested by @BioComputerSience ?
No it doesn't ... feel free to whip up a PR.
The PR is open. When will it be released?
Can't find a PR ... would you mind posting a link to this?
But regarding timelines ... I no longer have much time on PLC4X (More on that here https://github.com/chrisdutz/blog/blob/main/plc4x/throwing-the-towel.adoc) ... so I guess there won't be a big release any time soon, the way I currently see it unless someone from the community does it.
https://github.com/apache/plc4x/pull/1701
Well that is already merged ... so it's in ... you just need to build it yourself.
@chrisdutz As far as I can see the PR was just closed instead of merged. I don't see the relevant changes on "develop" branch.
You are absolutely right ... I wanted to merge it, not sure why I clicked "Close" ... thanks for pointing that out ... and possibly explains why I claimed to have merged it ;-)
@chrisdutz No problem. Thanks for the merge ;-)