quarkus icon indicating copy to clipboard operation
quarkus copied to clipboard

Fixed concurrency bug in StartupContext

Open Arcane561 opened this issue 11 months ago • 28 comments

Fixed the issue of parallel addition and reading after. ArrayList is not a concurrency safe collection and subsequent threads may not see the changes. The add streams and read streams are different. And shutdown events can also be added during the application (pg vertex as an example) and this does not happen in the main thread. Please consider the corrections

Arcane561 avatar Mar 13 '24 20:03 Arcane561

Thanks for the contribution, but I'd like to know whether you are actually encountered a problem that this change fixes.

My point is that theoretically, sure the addShutdownTask and addLastShutdownTask could be called on any thread, but I am pretty certain that is not the case currently - all tasks are added and run on the main thread.

geoand avatar Mar 14 '24 07:03 geoand

Thanks for the contribution, but I'd like to know whether you are actually encountered a problem that this change fixes.

My point is that theoretically, sure the addShutdownTask and addLastShutdownTask could be called on any thread, but I am pretty certain that is not the case currently - all tasks are added and run on the main thread.

Have a nice day, thank you for reviewing it so quickly. To be honest, I didn't have any problems with this specifically (although I didn't test it in detail). I was developing an extension that would allow you to accurately register the shutdownhandler after all the infrastructure addons have already registered their shutdown handlers (since, for example, I need to leave an entry in the database if the program terminates, but now it turns out that pgpoll closes before the code in the @Shutdown methods is called) and I noticed, that changing the collection and reading when the application is closed occurs in different thread (screenshots are attached below). I have prepared a test project for demonstration.

Repository https://github.com/Arcane561/quarkus-concurrency-example

To run it, you need a postgres database, and you will also need changes to the properties. https://github.com/Arcane561/quarkus-concurrency-example/blob/master/src/main/resources/application.properties

To quickly launch postgres, I provide the following commands

docker run -d -p 127.0.0.1:5432:5432 \      
  --name some-timescaledb-example \
  -e POSTGRES_PASSWORD=example \
  -e POSTGRES_USER=example \
  -e POSTGRES_DB=example \
  timescale/timescaledb:latest-pg14

and then the property needs to be changed to

quarkus.datasource.db-kind=postgresql
quarkus.datasource.username=example
quarkus.datasource.password=example
quarkus.datasource.reactive.url=postgresql://localhost:5432/example

The debager stop points should be placed here: https://github.com/quarkusio/quarkus/blob/67d73901be2901033f1417c9377ae092b679bcb3/core/runtime/src/main/java/io/quarkus/runtime/StartupContext.java#L33

https://github.com/quarkusio/quarkus/blob/67d73901be2901033f1417c9377ae092b679bcb3/core/runtime/src/main/java/io/quarkus/runtime/StartupContext.java#L73 Screenshot_20240315_004153 Screenshot_20240315_003708

Arcane561 avatar Mar 14 '24 22:03 Arcane561

Thanks for the feedback, but unless I am missing something, I don't see what you describe in the code you uploaded.

geoand avatar Mar 15 '24 07:03 geoand

Thanks for the feedback, but unless I am missing something, I don't see what you describe in the code you uploaded.

Good day to you, to eliminate misunderstandings, we are now talking about the code in https://github.com/Arcane561/quarkus-concurrency-example ? if so, then I can clearly see on 2 independent machines how PgPollRecorder adds a Shutdown handler not in the main thread, but in the thread of my ScheduledExecutorService. And also the reading does not take place in the main thread in the shutdown thread. I can describe in more detail the points of how and what to run and to reproduce this behavior

Arcane561 avatar Mar 15 '24 10:03 Arcane561

Ah okay, I thought you had added code that adds a shutdown handler.

I'll look into it more soon.

geoand avatar Mar 15 '24 10:03 geoand

if so, then I can clearly see on 2 independent machines how PgPollRecorder adds a Shutdown handler not in the main thread

I am not seeing this...

geoand avatar Mar 15 '24 12:03 geoand

I am not seeing this...

Just in case, I notify you that it is being PgPollRecorder constructed (and therefore will add a shutdown event) only 2 seconds after the application is launched. This is regulated by this line https://github.com/Arcane561/quarkus-concurrency-example/blob/e28ab236f3362a97137120c9306e42ea902047a5/src/main/java/org/quarkus/ExampleService.java#L24

Arcane561 avatar Mar 15 '24 12:03 Arcane561

Never mind my comment above, I see this as well.

geoand avatar Mar 15 '24 12:03 geoand

Okay, I see why this happens.

It comes down to the fact that your Database bean is "initialized" on your custom executor. If you change your Database bean to @Singleton, that results in the bean actually being initialized eagerly on the main thread.

Given how specific this is and how easily it is to work around it, I am not inclined to introduce the thread safe variants in StartupContext

geoand avatar Mar 15 '24 12:03 geoand

Okay, I see why this happens.

It comes down to the fact that your Database bean is "initialized" on your custom executor. If you change your Database bean to @Singleton, that results in the bean actually being initialized eagerly on the main thread.

Given how specific this is and how easily it is to work around it, I am not inclined to introduce the thread safe variants in StartupContext

Screenshot_20240315_152704

Unfortunately, this will not help

Arcane561 avatar Mar 15 '24 12:03 Arcane561

And I would also like to draw attention to the fact that shutdownthread is not the main thread either.

Screenshot_20240315_004153

Arcane561 avatar Mar 15 '24 12:03 Arcane561

Unfortunately, this will not help

I tried it and it does :)

geoand avatar Mar 15 '24 12:03 geoand

It seems that it's indeed not determenistic with @Singleton.

Even so, I want to try everything to avoid changing StartupContext

geoand avatar Mar 15 '24 12:03 geoand

It seems that it's indeed not determenistic with @Singleton.

Even so, I want to try everything to avoid changing StartupContext

I understand you, you can close my pr if you think you can do something better and more convenient for you. It's just that there's a race there right now and it scares me a little bit.

Arcane561 avatar Mar 15 '24 12:03 Arcane561

I need to think about it more

geoand avatar Mar 15 '24 12:03 geoand

I need to think about it more

And I have a big request for you, if you decide not to add concurrent collections, then please think about making sure that the @Shutdown methods are always executed first, without exceptions, because now I have to wait until PgPollRecorder is constructed and add my shutdown handler with the completion event

Arcane561 avatar Mar 15 '24 12:03 Arcane561

Right, I need to see why that happens

geoand avatar Mar 15 '24 12:03 geoand

@geoand Good day to you. Is there any news? If I can help you in any way, please let me know.

Arcane561 avatar Mar 21 '24 18:03 Arcane561

👋

I haven't had any time for it unfortunately

geoand avatar Mar 21 '24 18:03 geoand

@geoand Have a nice day. Have you not been able to watch it?

Arcane561 avatar Mar 28 '24 15:03 Arcane561

Hi,

I have not, no time.

Since I doubt I will have to look at this any time soon, maybe @mkouba can have a look and decide if we want to go down the route of your PR (which admitedly can't cause any harm other a potentially tiny slowdown)

geoand avatar Mar 28 '24 15:03 geoand

Hi,

I have not, no time.

Since I doubt I will have to look at this any time soon, maybe @mkouba can have a look and decide if we want to go down the route of your PR (which admitedly can't cause any harm other a potentially tiny slowdown)

Thank you very much for the answer, sorry to interrupt (

Arcane561 avatar Mar 28 '24 15:03 Arcane561

Not a problem :)

geoand avatar Mar 28 '24 15:03 geoand

Hi,

I have not, no time.

Since I doubt I will have to look at this any time soon, maybe @mkouba can have a look and decide if we want to go down the route of your PR (which admitedly can't cause any harm other a potentially tiny slowdown)

Well, given the fact that the list of shutdown tasks can be modified in a recorder method and there's no guarantee that it will be modified on the main thread I think that we should use a thread-safe structure here. I'm not so sure about ConcurrentLinkedDeque though. Collections.synchronizedList(List<T>) might be a more "straightforward" replacement?

mkouba avatar Apr 02 '24 09:04 mkouba

Good day to you mkouba. I used ConcurrentLinkedDeque for several reasons. The first is non-blocking addition and the second is that you don't have to reverse the list when reading handlers. I decided as an experiment to call the service in rest controller . And I found out that the handler is also added not in the main thread, but in threads serving the server (in vertx it is vertx poll threads, and in blocking it is executor).

non blocking rest point:

1

blocking rest point: 2

Arcane561 avatar Apr 02 '24 21:04 Arcane561

@mkouba Have a nice day. Could you tell me who I should work with on this pr now?

Arcane561 avatar Apr 05 '24 13:04 Arcane561

Good day to you @geoand , sorry if I am distracting could you please give feedback on what we will do about the problem...

Arcane561 avatar Apr 10 '24 12:04 Arcane561

Let's wait for @mkouba's feedback

geoand avatar Apr 10 '24 12:04 geoand

@mkouba Corrected, please check back when you have time.

Arcane561 avatar Apr 29 '24 16:04 Arcane561

@mkouba Corrected, please check back when you have time.

Arcane561 avatar May 08 '24 13:05 Arcane561