Failed with java.lang.IllegalStateException: Unable to claim exclusive exclusive lock on file

Open NaveenNatarajan97 opened this issue 2 years ago • 3 comments

We are trying to use Chronicle Queue with the below-mentioned versions:-

  1. chronicle-threads-2.21.87.jar
  2. chronicle-bytes-2.21.93.jar
  3. chronicle-core-2.21.95.jar
  4. chronicle-wire-2.21.93.jar
  5. chronicle-queue-5.21.96.jar

Below is the error message we are encountering.

java.lang.IllegalStateException: Unable to claim exclusive exclusive lock on file ./ChronicleTesting/124124queue65/metadata.cq4t
        at net.openhft.chronicle.queue.impl.table.SingleTableStore.doWithLock(SingleTableStore.java:160)
        at net.openhft.chronicle.queue.impl.table.SingleTableStore.doWithExclusiveLock(SingleTableStore.java:125)
        at net.openhft.chronicle.queue.impl.table.SingleTableBuilder.build(SingleTableBuilder.java:137)
        at net.openhft.chronicle.queue.impl.single.SingleChronicleQueueBuilder.initializeMetadata(SingleChronicleQueueBuilder.java:444)
        at net.openhft.chronicle.queue.impl.single.SingleChronicleQueueBuilder.preBuild(SingleChronicleQueueBuilder.java:1077)
        at net.openhft.chronicle.queue.impl.single.SingleChronicleQueueBuilder.build(SingleChronicleQueueBuilder.java:327)
        at Main.run(Main.java:65)
        at java.lang.Thread.run(Thread.java:748)

To reproduce this issue, we are attaching a Standalone Utility for the above case.

About the standalone utility:- It writes to, reads from, and clears each queue, then goes to sleep, and repeats this for 'n' iterations.

Command line argument:- java -Xmx10240M -jar ChronicleQueueGenerator-WriteReadClear.jar 25 200 1000 30 1000

Syntax:- java -Xmx10240M -jar ChronicleQueueGenerator-WriteReadClear.jar <arg_1> <arg_2> <arg_3> <arg_4> <arg_5>

Argument list:-

  • Arg_1: Number of threads.
  • Arg_2: Number of queues per thread.
  • Arg_3: Sleep duration (in ms).
  • Arg_4: Number of bytes of content written to each queue.
  • Arg_5: Number of iterations of the write, read, and clear cycle.

Example with explanation:- java -Xmx10240M -jar ChronicleQueueGenerator-WriteReadClear.jar 25 200 1000 30 1000

  • Heap space: 10 GB (-Xmx10240M)
  • Number of threads: 25
  • Number of queues per thread: 200
  • Sleep duration: 1000 ms
  • Bytes allocated: 30 bytes
  • Number of iterations: 1000

Queues will be created under the ChronicleTesting directory, in the directory where the jar is executed.

With the above example, a total of 25 * 200 = 5000 queues will be created under the ChronicleTesting directory.

Flow:-

  1. 25 Threads will be created.
  2. Each thread will create 200 queues and, for each queue, write the allocated number of bytes (here, 30 bytes), read from the queue, and then clear the queue.
  3. Then go to sleep. (Here, for 1000 ms)
  4. Steps 2 & 3 will be repeated for N iterations. (Here, 1000)
  5. The output message will look like: Write and Read done for Thread: 1629 Iteration number : 0
  6. After this is completed, each thread will enter a never-ending loop.
  7. However, we are facing the exclusive lock issue.
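
The flow above can be sketched with plain JDK classes. This is not the attached utility: `WriteReadClearSketch`, `QueueAction`, `runWorkers`, and the queue naming scheme are hypothetical, the Chronicle Queue write/read/clear calls are replaced by a callback so the sketch stays self-contained, and the never-ending loop from step 6 is left out.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicLong;

final class WriteReadClearSketch {

    /** Hypothetical stand-in for "write, read, and clear one queue". */
    interface QueueAction {
        void run(String queueName, int payloadBytes);
    }

    /**
     * Runs threads * queuesPerThread * iterations actions and returns
     * how many of them completed.
     */
    static long runWorkers(int threads, int queuesPerThread, long sleepMs,
                           int payloadBytes, int iterations, QueueAction action) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicLong completed = new AtomicLong();
        CountDownLatch done = new CountDownLatch(threads);
        for (int t = 0; t < threads; t++) {
            final int threadNo = t;
            pool.execute(() -> {
                try {
                    for (int i = 0; i < iterations; i++) {
                        // Step 2: visit every queue owned by this thread.
                        for (int q = 0; q < queuesPerThread; q++) {
                            action.run("queue-" + threadNo + "-" + q, payloadBytes);
                            completed.incrementAndGet();
                        }
                        // Step 3: sleep before the next iteration.
                        Thread.sleep(sleepMs);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    done.countDown();
                }
            });
        }
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            pool.shutdown();
        }
        return completed.get();
    }

    public static void main(String[] args) {
        // Small numbers so the sketch finishes quickly; the issue used 25 200 1000 30 1000.
        long n = runWorkers(2, 3, 1, 30, 2, (name, bytes) -> { /* queue work goes here */ });
        System.out.println("actions completed: " + n);
    }
}
```

With the arguments from the issue, the callback would be invoked 25 * 200 = 5000 times per iteration, each time against a different queue directory.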

Note:- We are looking for a solution. We came across this closed ticket (https://github.com/OpenHFT/Chronicle-Queue/issues/641) and, as per this comment (https://github.com/OpenHFT/Chronicle-Queue/issues/641#issuecomment-1735258742), we are attaching the standalone utility jar to reproduce the issue consistently.

Requesting to take a look at this scenario.

Thanks in advance.

Attachment: Standalone Utility.tar.gz

NaveenNatarajan97, Oct 03 '23 13:10

@NaveenNatarajan97 can I suggest you try the latest version to see if this still happens? If you can provide a (failing) unit test as part of a PR against the latest version, then you have a much greater chance of this getting addressed.

If you need someone to spend the time to support you, I suggest you take a look at the support page and contact Chronicle Software.

JerryShea, Oct 03 '23 20:10

@JerryShea Thanks for the response.

We used the latest version, 5.24ea26, and were able to reproduce the issue there as well.

While trying to create a PR against the latest version, we debugged and noticed that there is a timeout in SingleTableStore.java:

try (final FileChannel channel = FileChannel.open(file.toPath(), readOrWrite)) {
    for (int count = 1; System.currentTimeMillis() < timeoutAt; count++) {

The FileChannel.open call took more than 10000 ms, which causes the doWithLock method to throw IllegalStateException("Unable to claim exclusive " + type + " lock on file " + file).
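
To illustrate the observation, here is a simplified, JDK-only sketch of a deadline-based lock loop with the same shape as the two quoted lines. It is not the SingleTableStore code: `LockWithDeadlineSketch`, `lockWithin`, and the 1 ms retry pause are hypothetical, and it assumes (as the behaviour described above suggests) that the deadline is fixed before the file is opened, so time spent in FileChannel.open counts against the budget.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class LockWithDeadlineSketch {

    /**
     * Tries to take an exclusive lock on the file until timeoutMs has elapsed.
     * Returns the number of attempts made; throws IllegalStateException on timeout.
     */
    static int lockWithin(Path file, long timeoutMs) {
        // Assumption: the deadline is computed before the open call.
        final long timeoutAt = System.currentTimeMillis() + timeoutMs;
        try (FileChannel channel = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
            // If open() already used up the budget, this loop body never runs.
            for (int count = 1; System.currentTimeMillis() < timeoutAt; count++) {
                try {
                    FileLock lock = channel.tryLock();
                    if (lock != null) {
                        lock.release(); // the protected work would happen before this
                        return count;
                    }
                } catch (OverlappingFileLockException e) {
                    // Lock is held elsewhere in this JVM; retry below.
                }
                Thread.sleep(1);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while locking " + file, e);
        }
        throw new IllegalStateException("Unable to claim exclusive lock on file " + file);
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("lock-sketch", ".tmp");
        System.out.println("attempts: " + lockWithin(file, 10_000));
        Files.deleteIfExists(file);
    }
}
```

In this sketch, a zero or already-expired budget produces the exception without a single tryLock attempt, which matches the symptom of a slow open under heavy file-system load.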

We noticed there is an option to change the timeout parameter by using "chronicle.table.store.timeoutMS".

We increased the timeout and now see fewer exceptions, so increasing chronicle.table.store.timeoutMS helps us.

  • What is the maximum value we can set here?
  • Will increasing this parameter (chronicle.table.store.timeoutMS) cause any other issues?
  • Is there a way to disable this timeout?
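
For reference, this is roughly how we set the value, assuming chronicle.table.store.timeoutMS is read as a JVM system property. The sketch uses JDK calls only: `TableStoreTimeoutSketch` and `effectiveTimeoutMs` are hypothetical names, the 10000 ms fallback is the figure mentioned above, 60000 is just an example value, and how Chronicle Queue reads or caches the property internally is an assumption.

```java
final class TableStoreTimeoutSketch {

    static final String TIMEOUT_PROPERTY = "chronicle.table.store.timeoutMS";

    /** Reads the property as a long, falling back to defaultMs when it is unset. */
    static long effectiveTimeoutMs(long defaultMs) {
        return Long.getLong(TIMEOUT_PROPERTY, defaultMs);
    }

    public static void main(String[] args) {
        // Same effect as launching with -Dchronicle.table.store.timeoutMS=60000.
        // Assumption: this must run before any queue is built, in case the
        // library reads the value only once.
        System.setProperty(TIMEOUT_PROPERTY, "60000");
        System.out.println(effectiveTimeoutMs(10_000L));
    }
}
```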

NaveenNatarajan97, Oct 06 '23 09:10

Attaching the source code of the standalone utility we used to reproduce the issue: Chronicle_Queue_Standalone_Utility.txt

Thanks in advance.

NaveenNatarajan97, Oct 10 '23 09:10