Server crash when ignite service is stopped on client node

Open alex401401 opened this issue 1 year ago • 4 comments

There is a cluster of 3 nodes: 1 server and 2 client nodes. The server is started with ignite.bat without parameters. The 2 client nodes run as part of my .NET applications. One (N1) hosts the Ignite services, and the other (N2) is a console application that starts, stops and controls the services in the cluster. When I give a command on N2 to start a service, services.DeployNodeSingleton("TestService", TestServiceImpl), everything is fine: the service starts on N1 and works. But when I give a command to stop the service, services.Cancel("TestService"), an exception occurs on the server and it crashes. At the same time, TestService is successfully stopped on N1.

Error occurring on the server:

[12:42:31,848][SEVERE][services-deployment-worker-#74%main-grid%][ServiceDeploymentTask] Error occurred while initializing deployment task, err=Cannot invoke "org.apache.ignite.internal.processors.service.ServiceInfo.name()" because "rmv" is null
java.lang.NullPointerException: Cannot invoke "org.apache.ignite.internal.processors.service.ServiceInfo.name()" because "rmv" is null
        at org.apache.ignite.internal.processors.service.IgniteServiceProcessor.lambda$updateDeployedServices$4(IgniteServiceProcessor.java:1599)
        at java.base/java.util.HashMap.forEach(HashMap.java:1421)
        at java.base/java.util.Collections$UnmodifiableMap.forEach(Collections.java:1553)
        at org.apache.ignite.internal.processors.service.IgniteServiceProcessor.updateDeployedServices(IgniteServiceProcessor.java:1594)
...

I looked at the source code of the method void updateDeployedServices(final ServiceDeploymentActions depActions) in modules/core/src/main/java/org/apache/ignite/internal/processors/service/IgniteServiceProcessor.java. In the line ServiceInfo rmv = deployedServices.remove(srvcId); the call deployedServices.remove(srvcId) probably returns null. This is not handled in any way, so an exception is thrown at deployedServicesByName.remove(rmv.name()).
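To illustrate the suspected failure mode outside of Ignite, here is a minimal Java sketch. It is not the Ignite source: the class RemoveNullGuardSketch, the Info type and both method names are hypothetical, and only the remove-then-dereference pattern is taken from the line quoted above. Whether a null guard is the appropriate fix, as opposed to finding out why the entry is missing on the server, is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

public class RemoveNullGuardSketch {
    // Hypothetical stand-in for ServiceInfo; only name() matters here.
    static final class Info {
        private final String name;

        Info(String name) {
            this.name = name;
        }

        String name() {
            return name;
        }
    }

    // Same shape as the reported line: the result of remove() is dereferenced
    // without a check, so an unknown id leads to a NullPointerException.
    static void removeUnchecked(Map<String, Info> byId, Map<String, Info> byName, String id) {
        Info rmv = byId.remove(id);
        byName.remove(rmv.name());
    }

    // Guarded variant (an assumption, not the actual fix): skip the second
    // removal when the id was not present and report that to the caller.
    static boolean removeGuarded(Map<String, Info> byId, Map<String, Info> byName, String id) {
        Info rmv = byId.remove(id);
        if (rmv == null) {
            return false;
        }
        byName.remove(rmv.name());
        return true;
    }

    public static void main(String[] args) {
        Map<String, Info> byId = new HashMap<>();
        Map<String, Info> byName = new HashMap<>();
        Info info = new Info("TestService");
        byId.put("id-1", info);
        byName.put(info.name(), info);

        // First removal finds the entry; the second one does not.
        System.out.println(removeGuarded(byId, byName, "id-1"));
        System.out.println(removeGuarded(byId, byName, "id-1"));

        try {
            removeUnchecked(byId, byName, "id-1");
        } catch (NullPointerException e) {
            System.out.println("NullPointerException, as in the report");
        }
    }
}
```

HashMap.remove returns null when the key has no mapping, which is why the unchecked variant fails as soon as the id is absent from the first map.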

alex401401, Sep 02 '24 08:09

Looks like a bug.

I wonder if having a mix of .NET and Java nodes causes this.

Could you please try using Apache.Ignite.exe instead of ignite.bat to start the server node and see if the bug still occurs?

ptupitsyn, Sep 02 '24 09:09

We did not find Apache.Ignite.exe in the Ignite 2.16.0 distribution, so we ran the server this way: dotnet Apache.Ignite.Executable.dll. Everything is fine here; there is no error.

alex401401, Sep 02 '24 11:09

Thank you for the confirmation. Yes, dotnet Apache.Ignite.Executable.dll is correct.

Ticket created: https://issues.apache.org/jira/browse/IGNITE-23123

ptupitsyn, Sep 02 '24 13:09

The error also reproduces with a simpler setup of 1 Java server and 1 .NET thick client. On the client we run the following code:

    var cluster = Ignite.GetCluster().ForDotNet();
    IServices services = cluster.GetServices();
    services.DeployNodeSingleton("TestService", new TestServiceImpl());
    services.Cancel("TestService");

alex401401, Sep 03 '24 04:09