curator icon indicating copy to clipboard operation
curator copied to clipboard

[CURATOR-375] Fix thread interruption being reported twice

Open jira-importer opened this issue 8 years ago • 18 comments

When a curator operation thread is interrupted, some classes (PersistentNode ConnectionState primarily) report the interruption in two ways at the same time - by re-marking the thread interruption status and throwing InterruptedException - this makes it look like the thread has been interrupted twice, rather than once.


Originally reported by thecoop1984, imported from: Fix thread interruption being reported twice
  • assignee: randgalt
  • status: Open
  • priority: Major
  • resolution: Unresolved
  • imported: 2025-01-21

jira-importer avatar Jan 05 '17 14:01 jira-importer

githubbot:

GitHub user thecoop opened a pull request:

https://github.com/apache/curator/pull/186

CURATOR-375 - fix thread interruption being reported twice

In some classes, thread interruption was being reported twice - by setting the thread interrupted flag and re-throwing the `InterruptedException` - this made it look like the thread was actually interrupted twice, and generally causes spurious interruption - even if the `InterruptedException` is handled, the thread interruption status is still set, meaning the next async operation that happens will also fail.

Affected classes:

  • `ConnectionState` - `closeAndClear` can never throw `InterruptedException` canyway, so remove the check completely
  • `PersistentNode` - as mentioned in the docs for `AutoCloseable`, close methods shouldn't throw InterruptedException anyway. The delete operation is guaranteed, so even if `deleteNode` throws an exception, the node will still (probably) be deleted
  • `CreateBuilderImpl`/`DeleteBuilderImpl` - the exception is being re-thrown, so don't need to re-set the interrupt status

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/thecoop/curator CURATOR-375

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/curator/pull/186.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #186


commit 81edd7e07a1fae2412c3e42e74a31b992aa72669
Author: Simon Cooper
Date: 2017-01-05T14:08:17Z

CURATOR-375 - fix thread interruption being reported using thread interrupted flag and an exception at the same time


jira-importer avatar Jan 05 '17 14:01 jira-importer

randgalt:

Do you have a test or a trace/log that shows the problem?

jira-importer avatar Jan 08 '17 17:01 jira-importer

githubbot:

Github user cammckenzie commented on a diff in the pull request:

https://github.com/apache/curator/pull/186#discussion_r95088443

— Diff: curator-framework/src/main/java/org/apache/curator/framework/imps/CreateBuilderImpl.java —
@@ -485,7 +485,6 @@ private String protectedPathInForeground(String adjustedPath, byte[] data) throw
}
catch ( Exception e)
{

  • ThreadUtils.checkInterrupted(e);
      • End diff –

Won't the removal of this call mean that the thread will not be marked as interrupted unless the caller catches the exception and resets the value themselves? This code is called from many recipes and can also be called explicitly by users Curator code. I would think that there's great potential to break code here, unless I'm missing something.

jira-importer avatar Jan 08 '17 21:01 jira-importer

githubbot:

Github user cammckenzie commented on the issue:

https://github.com/apache/curator/pull/186

Is this fix specifically for the PersistentNode class? It seems like it will potentially break a lot of stuff.

jira-importer avatar Jan 08 '17 21:01 jira-importer

githubbot:

Github user thecoop commented on a diff in the pull request:

https://github.com/apache/curator/pull/186#discussion_r95382951

— Diff: curator-framework/src/main/java/org/apache/curator/framework/imps/CreateBuilderImpl.java —
@@ -485,7 +485,6 @@ private String protectedPathInForeground(String adjustedPath, byte[] data) throw
}
catch ( Exception e)
{

  • ThreadUtils.checkInterrupted(e);
      • End diff –

No, because the `InterruptedException` is rethrown in every case. Thread interruption status is represented by either the thread interrupt status or an `InterruptedException`, not both. This is mentioned in the Java tutorials at http://docs.oracle.com/javase/tutorial/essential/concurrency/interrupt.html:

> By convention, any method that exits by throwing an InterruptedException clears interrupt status when it does so.

jira-importer avatar Jan 10 '17 15:01 jira-importer

githubbot:

Github user thecoop commented on the issue:

https://github.com/apache/curator/pull/186

I first came across this issue in the context of `PersistentNode`, but it applies to the other classes in the PR

jira-importer avatar Jan 10 '17 15:01 jira-importer

thecoop1984:

There's no specific test case for this, but the current behaviour of curator violates the expectations at the bottom of http://docs.oracle.com/javase/tutorial/essential/concurrency/interrupt.html - thread interruption is reported either by the interrupt flag on the thread, or throwing InterruptedException, but not both.

The current behaviour means that, even if the `InterruptedException` is handled, the next time the thread calls `Thread.sleep` or similar, that will immediately throw `InterruptedException` too, even if no interruption has occurred between the two

jira-importer avatar Jan 10 '17 15:01 jira-importer

githubbot:

Github user Randgalt commented on a diff in the pull request:

https://github.com/apache/curator/pull/186#discussion_r95387047

— Diff: curator-framework/src/main/java/org/apache/curator/framework/imps/CreateBuilderImpl.java —
@@ -485,7 +485,6 @@ private String protectedPathInForeground(String adjustedPath, byte[] data) throw
}
catch ( Exception e)
{

  • ThreadUtils.checkInterrupted(e);
      • End diff –

re-throwing InterruptedException does not reset the Thread interrupted status of the thread.

jira-importer avatar Jan 10 '17 15:01 jira-importer

randgalt:

There are no cases of Curator handling thread interruption that I know of. I don't understand what the harm of resetting the thread interrupted status is. Can you explain the issue?

jira-importer avatar Jan 10 '17 15:01 jira-importer

randgalt:

That same page you link to states: "By convention, any method that exits by throwing an InterruptedException clears interrupt status when it does so. However, it's always possible that interrupt status will immediately be set again, by another thread invoking interrupt."

jira-importer avatar Jan 10 '17 15:01 jira-importer

githubbot:

Github user thecoop commented on a diff in the pull request:

https://github.com/apache/curator/pull/186#discussion_r95403890

— Diff: curator-framework/src/main/java/org/apache/curator/framework/imps/CreateBuilderImpl.java —
@@ -485,7 +485,6 @@ private String protectedPathInForeground(String adjustedPath, byte[] data) throw
}
catch ( Exception e)
{

  • ThreadUtils.checkInterrupted(e);
      • End diff –

The caller knows precisely because InterruptedException is being thrown - that is how the information is conveyed to the caller. Check http://docs.oracle.com/javase/tutorial/essential/concurrency/interrupt.html:
> By convention, any method that exits by throwing an InterruptedException clears interrupt status when it does so.

jira-importer avatar Jan 10 '17 16:01 jira-importer

githubbot:

Github user Randgalt commented on a diff in the pull request:

https://github.com/apache/curator/pull/186#discussion_r95404809

— Diff: curator-framework/src/main/java/org/apache/curator/framework/imps/CreateBuilderImpl.java —
@@ -485,7 +485,6 @@ private String protectedPathInForeground(String adjustedPath, byte[] data) throw
}
catch ( Exception e)
{

  • ThreadUtils.checkInterrupted(e);
      • End diff –

Most code does not check for `InterruptedException` and this is the problem. Again, please explain what the issue being fixed is here. Is this merely an academic issue or is there a real problem?

jira-importer avatar Jan 10 '17 16:01 jira-importer

githubbot:

Github user thecoop commented on a diff in the pull request:

https://github.com/apache/curator/pull/186#discussion_r95409863

— Diff: curator-framework/src/main/java/org/apache/curator/framework/imps/CreateBuilderImpl.java —
@@ -485,7 +485,6 @@ private String protectedPathInForeground(String adjustedPath, byte[] data) throw
}
catch ( Exception e)
{

  • ThreadUtils.checkInterrupted(e);
      • End diff –

I've just pushed a basic test case showing the issue. The thread is interrupted once, but it is exposed to the caller twice - once through close() throwing an exception, and once through whatever next checks the interrupt status (here, I've used Thread.sleep)

jira-importer avatar Jan 10 '17 17:01 jira-importer

githubbot:

Github user thecoop commented on a diff in the pull request:

https://github.com/apache/curator/pull/186#discussion_r95410902

— Diff: curator-framework/src/main/java/org/apache/curator/framework/imps/CreateBuilderImpl.java —
@@ -485,7 +485,6 @@ private String protectedPathInForeground(String adjustedPath, byte[] data) throw
}
catch ( Exception e)
{

  • ThreadUtils.checkInterrupted(e);
      • End diff –

As a correction to the previous comment, this fixes the issue by exposing the interruption using the interrupt status only, and not throwing an exception, rather than doing both as it is at the moment.

jira-importer avatar Jan 10 '17 17:01 jira-importer

githubbot:

Github user Randgalt commented on a diff in the pull request:

https://github.com/apache/curator/pull/186#discussion_r95412840

— Diff: curator-recipes/src/test/java/org/apache/curator/framework/recipes/nodes/TestPersistentNode.java —
@@ -136,4 +138,52 @@ public void testQuickCloseNodeExists() throws Exception
CloseableUtils.closeQuietly(client);
}
}
+
+ @Test
+ public void testInterruption() throws Exception
— End diff –

I don't understand what this is testing. What does it matter how many times a thread is interrupted? I feel like this is an academic exercise. Is there a real problem here?

jira-importer avatar Jan 10 '17 17:01 jira-importer

githubbot:

Github user thecoop commented on a diff in the pull request:

https://github.com/apache/curator/pull/186#discussion_r95420664

— Diff: curator-recipes/src/test/java/org/apache/curator/framework/recipes/nodes/TestPersistentNode.java —
@@ -136,4 +138,52 @@ public void testQuickCloseNodeExists() throws Exception
CloseableUtils.closeQuietly(client);
}
}
+
+ @Test
+ public void testInterruption() throws Exception
— End diff –

This issue is a root cause of some significant problems we've encountered. I'll have another look at this and come up with a more pertinent test case.

jira-importer avatar Jan 10 '17 17:01 jira-importer

githubbot:

Github user thecoop closed the pull request at:

https://github.com/apache/curator/pull/186

jira-importer avatar Jan 10 '17 17:01 jira-importer

githubbot:

Github user Randgalt commented on a diff in the pull request:

https://github.com/apache/curator/pull/186#discussion_r95421477

— Diff: curator-recipes/src/test/java/org/apache/curator/framework/recipes/nodes/TestPersistentNode.java —
@@ -136,4 +138,52 @@ public void testQuickCloseNodeExists() throws Exception
CloseableUtils.closeQuietly(client);
}
}
+
+ @Test
+ public void testInterruption() throws Exception
— End diff –

We'll need that otherwise I don't see the issue. The thread interrupt status is boolean. No matter how many times you set it it's still the same value.

jira-importer avatar Jan 10 '17 18:01 jira-importer