camel icon indicating copy to clipboard operation
camel copied to clipboard

Another set of code related warnings fixes

Open gnodet opened this issue 1 year ago • 10 comments

  • Fix visibility issues
  • Fix busy-wait loop issues

gnodet avatar May 22 '24 08:05 gnodet

:star2: Thank you for your contribution to the Apache Camel project! :star2:

:robot: CI automation will test this PR automatically.

:camel: Apache Camel Committers, please review the following items:

  • First-time contributors require MANUAL approval for the GitHub Actions to run

  • You can use the command /component-test (camel-)component-name1 (camel-)component-name2.. to request a test from the test bot.

  • You can label PRs using build-all, build-dependents, skip-tests and test-dependents to fine-tune the checks executed by this PR.

  • Build and test logs are available in the Summary page. Only Apache Camel committers have access to the summary.

  • :warning: Be careful when sharing logs. Review their contents before sharing them publicly.

github-actions[bot] avatar May 22 '24 08:05 github-actions[bot]

There's another busy-wait loop issue in org.apache.camel.support.Task. This is usually not a problem when performing non trivial tasks and re-attempting again in case of a failure, such as trying to connect to a remote server every 5 seconds. However, this package is also used to wait for trivial conditions, such as in the ServicePool where a foreground task is used to wait for a service to be started every 5 milliseconds. I think this is an abuse of the framework. For such cases, I think an alternative strategy should be provided, using an implementation of java.util.concurrent.Condition that can be waited for.

gnodet avatar May 22 '24 08:05 gnodet

There's another busy-wait loop issue in org.apache.camel.support.Task. This is usually not a problem when performing non trivial tasks and re-attempting again in case of a failure, such as trying to connect to a remote server every 5 seconds. However, this package is also used to wait for trivial conditions, such as in the ServicePool where a foreground task is used to wait for a service to be started every 5 milliseconds. I think this is an abuse of the framework. For such cases, I think an alternative strategy should be provided, using an implementation of java.util.concurrent.Condition that can be waited for.

Upon further investigation, the potential problem comes from this commit: https://github.com/apache/camel/commit/07b9c68ea2a9a70cf3a3f42c1045fb9fd0acbe6a I'm not sure how this is supposed to work: services usually inherit from ServiceSupport class where starting the service is a blocking operation. When the service is created, we call addService with forceStart set to true. My understanding is that returning from this call, the service will either be started, or the start will be deferred, but I don't really see how it can be in the STARTING state. @orpiske ?

gnodet avatar May 22 '24 08:05 gnodet

There's another busy-wait loop issue in org.apache.camel.support.Task. This is usually not a problem when performing non trivial tasks and re-attempting again in case of a failure, such as trying to connect to a remote server every 5 seconds. However, this package is also used to wait for trivial conditions, such as in the ServicePool where a foreground task is used to wait for a service to be started every 5 milliseconds. I think this is an abuse of the framework. For such cases, I think an alternative strategy should be provided, using an implementation of java.util.concurrent.Condition that can be waited for.

Upon further investigation, the potential problem comes from this commit: 07b9c68 I'm not sure how this is supposed to work: services usually inherit from ServiceSupport class where starting the service is a blocking operation. When the service is created, we call addService with forceStart set to true. My understanding is that returning from this call, the service will either be started, or the start will be deferred, but I don't really see how it can be in the STARTING state. @orpiske ?

The key problem I was avoiding on 07b9c68ea2a9a70cf3a3f42c1045fb9fd0acbe6a, was the if (producer instanceof StatefulService) { which caused a type check miss under high load.

The STARTING state check had been there before. Looking at the history of this change, it seems that the STARTING state check is related to CAMEL-14048 and a problem with Netty.

orpiske avatar May 22 '24 09:05 orpiske

@gnodet it would probably be good to run a full test on the code before merging. I've triggered a build on my own CI to see how it goes.

orpiske avatar May 22 '24 09:05 orpiske

There's another busy-wait loop issue in org.apache.camel.support.Task. This is usually not a problem when performing non trivial tasks and re-attempting again in case of a failure, such as trying to connect to a remote server every 5 seconds. However, this package is also used to wait for trivial conditions, such as in the ServicePool where a foreground task is used to wait for a service to be started every 5 milliseconds. I think this is an abuse of the framework. For such cases, I think an alternative strategy should be provided, using an implementation of java.util.concurrent.Condition that can be waited for.

Upon further investigation, the potential problem comes from this commit: 07b9c68 I'm not sure how this is supposed to work: services usually inherit from ServiceSupport class where starting the service is a blocking operation. When the service is created, we call addService with forceStart set to true. My understanding is that returning from this call, the service will either be started, or the start will be deferred, but I don't really see how it can be in the STARTING state. @orpiske ?

The key problem I was avoiding on 07b9c68ea2a9a70cf3a3f42c1045fb9fd0acbe6a, was the if (producer instanceof StatefulService) { which caused a type check miss under high load.

The STARTING state check had been there before. Looking at the history of this change, it seems that the STARTING state check is related to CAMEL-14048 and a problem with Netty.

Thx for the pointers. I've had another look, I think the fix https://github.com/apache/camel/commit/0f9f8790cccf3e45af4fe9c46ec030c5e63d2083 is wrong and the problem was actually fixed correctly by https://github.com/apache/camel/commit/4c8c071fba493dbf41f09103712036805133a5a8. This commit makes sure that the double-check locking only see a fully started service.

I'll add a commit to remove that part of the code.

gnodet avatar May 22 '24 11:05 gnodet

@gnodet it would probably be good to run a full test on the code before merging. I've triggered a build on my own CI to see how it goes.

Agreed. The PR template mentions build-all but that's not an existing label, I've seen incremental-build-all however. Is that the correct one ? We need to fix the template btw...

gnodet avatar May 22 '24 11:05 gnodet

There's another busy-wait loop issue in org.apache.camel.support.Task. This is usually not a problem when performing non trivial tasks and re-attempting again in case of a failure, such as trying to connect to a remote server every 5 seconds. However, this package is also used to wait for trivial conditions, such as in the ServicePool where a foreground task is used to wait for a service to be started every 5 milliseconds. I think this is an abuse of the framework. For such cases, I think an alternative strategy should be provided, using an implementation of java.util.concurrent.Condition that can be waited for.

Upon further investigation, the potential problem comes from this commit: 07b9c68 I'm not sure how this is supposed to work: services usually inherit from ServiceSupport class where starting the service is a blocking operation. When the service is created, we call addService with forceStart set to true. My understanding is that returning from this call, the service will either be started, or the start will be deferred, but I don't really see how it can be in the STARTING state. @orpiske ?

The key problem I was avoiding on 07b9c68ea2a9a70cf3a3f42c1045fb9fd0acbe6a, was the if (producer instanceof StatefulService) { which caused a type check miss under high load. The STARTING state check had been there before. Looking at the history of this change, it seems that the STARTING state check is related to CAMEL-14048 and a problem with Netty.

Thx for the pointers. I've had another look, I think the fix 0f9f879 is wrong and the problem was actually fixed correctly by 4c8c071. This commit makes sure that the double-check locking only see a fully started service.

I'll add a commit to remove that part of the code.

Pushed a commit. However I think the ServicePool still has some synchronization issues. We use a BlockingQueue, but we're holding a lock to access it, so that's quite unexpected... That's for another PR though, maybe with the ResequencerEngine concurrency to avoid locking on the object itself.

gnodet avatar May 22 '24 11:05 gnodet

@gnodet it would probably be good to run a full test on the code before merging. I've triggered a build on my own CI to see how it goes.

Agreed. The PR template mentions build-all but that's not an existing label, I've seen incremental-build-all however. Is that the correct one ? We need to fix the template btw...

I think these labels are outdated. IIRC, we never managed to run full builds on GH because it takes too long and causes the action runner to abort after some time.

For now, the only way to run a full test is by running the tests manually (or, semi automatically, as I do on my environment).

orpiske avatar May 22 '24 12:05 orpiske

First run (before the last commit pushed) had only 1 failure on org.apache.camel.component.infinispan.remote.InfinispanRemoteProducerIT.replaceAValueByKeyAsyncWithOldValue.

I have a new build running with the latest commit.

orpiske avatar May 22 '24 13:05 orpiske