testcontainers-java icon indicating copy to clipboard operation
testcontainers-java copied to clipboard

Solves: 5359 by using a more flexible wait strategy

Open TomCools opened this issue 3 years ago • 7 comments

Hi all,

This is a first draft of a change made in regards to #5359. We (@kiview & @eddumelendez ) agreed in that issue that we'd need a more flexible wait-strategy to fix that issue. In this draft for now I have added 2 experimental wait-strategies.

  • HoggingLogMessageWaitStrategy: keeps all logs in a single StringBuffer. All regex checks are performed against the full log history.
  • MultiLogMessageWaitStrategy: Allows a Deque of String Regex to be added. Every time an OutputFrame matches the first regex in the Deque, we pop it. The wait is over when the Deque is empty.

I added some more details in comments at the top of the class

What I would like from you:

  • Feedback on the general approach of the WaitStrategies (API, internals, anything really) and some indication of your preferences.

Yet to do:

  • Write tests
  • Harden the solution by adding some logging, inputchecks, etc.
  • Use the new WaitStrategy with the PostgreSQLContainer. (could be new merge request!)

TomCools avatar Jun 15 '22 19:06 TomCools

thanks for the draft @TomCools ! both approaches look great. I would say that given for what I have seen not always all the logs are needed. I would be taking the approach for MultiLogMessageWaitStrategy for Postgres.

Just throwing an idea here: what if LogMessageWaitStrategy offer a waitPredicate and it is override by those proposals?

eddumelendez avatar Jun 16 '22 22:06 eddumelendez

Just throwing an idea here: what if LogMessageWaitStrategy offer a waitPredicate and it is override by those proposals?

Not a big fan of this to be honest.
It requires either inheritance, which isn't possible because the class interface wouldn't make sense (as both my proposals remove the "times()" method).

Alternatively, passing in the waitPredicate through a setter suffers from the same issues and would make the whole thing maybe a bit too flexible <=> usability of the class?

TomCools avatar Jun 17 '22 13:06 TomCools

We could make an abstract class that includes most of the logic, has an abstract method for the waitstrategy, then make the current LogMessageWaitStrategy extend that (as to not break the existing interface), then make the MultiLog extend that same abstract class.

How about that @eddumelendez? (if you are not too busy being a Java Champion <3)

TomCools avatar Jun 21 '22 10:06 TomCools

Added an example of what such an abstract class could look like.

TomCools avatar Jun 21 '22 18:06 TomCools

We could make an abstract class that includes most of the logic, has an abstract method for the waitstrategy, then make the current LogMessageWaitStrategy extend that (as to not break the existing interface), then make the MultiLog extend that same abstract class.

I like this idea. Thanks for already submit an implementation @TomCools ! 👍🏽

eddumelendez avatar Jun 23 '22 02:06 eddumelendez

Just to bring some of the Slack debugging conversations into this PR for further visibility:

One part of the logs come on STDERR, the other on STDOUT. All proposed strategies seem to have a race condition between the streams, even the Hogging strategy.

I think we, unfortunately, need an implementation like this:

  • Wait for one regex on STDOUT
  • Depending on which regex matched, wait for a different regex on STDERR

Does not look ideal, just a specific workaround for Postgres.

kiview avatar Jul 18 '22 13:07 kiview

Maybe instead, we should consider documenting the workaround (with a custom waiter) and forget about trying to fix it in the class? Or we could add something like .withExistingData() which will set the correct waiter? Puts the reponsability with the user tho.

TomCools avatar Jul 19 '22 08:07 TomCools

Hi all,

since I came across a similar issue: I wanted to have a safe check if my postgres is ready instead of a regEx check, which I find it to be somewhat unreliable.

I made a custom wait strategy specific for postgres using pg_isready.

  • https://stackoverflow.com/a/72175755/11770752

When executing pg_isready following result is display

image

Custom WaitStrategy

package container;

import java.io.IOException;
import org.assertj.core.api.Assertions;
import org.testcontainers.containers.Container;
import org.testcontainers.containers.wait.strategy.AbstractWaitStrategy;

public class PostgresWaitStrategy extends AbstractWaitStrategy {

    private final long timeoutInMillis;

    private PostgresWaitStrategy(long timeoutInMillis) {
        this.timeoutInMillis = timeoutInMillis;
    }

    public static PostgresWaitStrategy create() {
        return new PostgresWaitStrategy(10_000L);
    }

    public static PostgresWaitStrategy withTimeoutInMillis(long timeoutInMillis) {
        return new PostgresWaitStrategy(timeoutInMillis);
    }

    public static PostgresWaitStrategy withTimeoutInSeconds(long timeoutInSeconds) {
        long timeoutInMillis = timeoutInSeconds * 1_000L;
        return new PostgresWaitStrategy(timeoutInMillis);
    }

    @Override
    protected void waitUntilReady() {
        final long startInMillis = System.currentTimeMillis();
        long elapsedTimeInMillis;

        boolean databaseIsReady = false;
        while (!databaseIsReady) {
            Container.ExecResult execResult = checkIfDatabaseIsReady();
            databaseIsReady = isDatabaseReadyEvaluation(execResult);

            elapsedTimeInMillis = elapsedTimeInMillis(startInMillis);
            checkIfTimedOut(elapsedTimeInMillis);
        }
    }

    private Container.ExecResult checkIfDatabaseIsReady() {
        try {
            return waitStrategyTarget.execInContainer("pg_isready", "-d", "db_prod");
        } catch (IOException | InterruptedException e) {
            Assertions.fail("Error during postgres container execution: ", e);
            throw new RuntimeException(e);
        }
    }

    private boolean isDatabaseReadyEvaluation(Container.ExecResult execResult) {
        int exitCode = execResult.getExitCode();
        String message = execResult.getStdout();
        return exitCode == 0 && message.contains("accepting connections");
    }

    private long elapsedTimeInMillis(long startInMillis) {
        return System.currentTimeMillis() - startInMillis;
    }

    private void checkIfTimedOut(long elapsedTimeInMillis) {
        if (elapsedTimeInMillis > timeoutInMillis) {
            long waitThresholdInSeconds = timeoutInMillis / 1_000L;
            throw new RuntimeException(
                String.format(
                    "Timed out after %d seconds waiting for postgres to accept connections.",
                    waitThresholdInSeconds
                )
            );
        }
    }
}

I don't know if it helps in your case, when you already have data, but it may be worth a try. :-)

Anyway I was considering if it's worth it to make a PR to add this wait strategy since a lot of people are using postgres, but I guess it's not intended to have specific waiting strategies?

anjeyy avatar Sep 25 '22 11:09 anjeyy

@anjeyy Thanks for adding that information! We have actually tried it with data before and there is a race condition. The problem is, if there is data present, Postgres will start 2 times, so this happens:

  • startup 1 time (complete)
  • pg_isready: WE GOOD TO GO! :-)
  • startup again, but tests have already started... :(

The easiest solution I've had to far was to just use the Regex one, and expect it to log 2 times if there is data.

TomCools avatar Sep 25 '22 11:09 TomCools

Well, it's been a year. 😄 Time to get this going again. Rebased and force pushed, made some initial changes again. Need to harden the solution a bit, but the local tests are working just fine! Looking forward to applying all this stuff and making a nice blog post about this.

TomCools avatar Jul 17 '23 20:07 TomCools