flink icon indicating copy to clipboard operation
flink copied to clipboard

[FLINK-34409][ci] Enables (most) tests that were disabled for the AdaptiveScheduler due to missing features

Open XComp opened this issue 1 year ago • 5 comments

What is the purpose of the change

Increasing test coverage.

Brief change log

  • Removes annotations

Verifying this change

This change is a trivial rework / code cleanup without any test coverage. PR is temporarily enabling the AdaptiveScheduler

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): no
  • The public API, i.e., is any changed class annotated with @Public(Evolving): no
  • The serializers: no
  • The runtime per-record code paths (performance sensitive): no
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
  • The S3 file system connector: no

Documentation

  • Does this pull request introduce a new feature? no
  • If yes, how is the feature documented? not applicable

XComp avatar Feb 07 '24 15:02 XComp

CI report:

  • 87a3ce53377b005461b331cb9724a19290061fdd Azure: SUCCESS
Bot commands The @flinkbot bot supports the following commands:
  • @flinkbot run azure re-run the last Azure build

flinkbot avatar Feb 07 '24 15:02 flinkbot

Should we check if any on the other tests that are currently disabled for AdaptiveScheduler would now work, e.g. ?

Yikes, that was actually my intention. But looks like I forgot to do another pass over the e2e test scripts. :-D Thanks. I will enable those as well. We should add a comment with the reason if we notice that they still fail. :+1:

XComp avatar Feb 07 '24 15:02 XComp

I went a bit naively into this task. :-D Some of the tests are still failing with the AdaptiveScheduler. I analyzed the test failures and added a proper reasoning for the future to document why the tests are disabled (if it was a conceptual issue).

XComp avatar Feb 08 '24 16:02 XComp

@flinkbot run azure

XComp avatar Feb 08 '24 18:02 XComp

I force-pushed a rebase to most-recent master w/o any other changes. I suspect that the test failures we were seeing in CI are related to FLINK-34420

XComp avatar Feb 20 '24 15:02 XComp

I fixed an indentation issue and rebased the branch.

@MartijnVisser @JingGe can one of you go ahead and approve this one? So that we can finalize the efforts and merge this PR.

I'm going to create backports as well to enable the tests in the release branches as well. I guess, increasing the test coverage also in those branches doesn't hurt.

XComp avatar Mar 15 '24 08:03 XComp

should it be rebased because of FLINK-34718 to be sure that ci is green?

PR is temporarily enabling the AdaptiveScheduler

this thing from PR description is not clear to me if we temporarily enable it, what is the long term plan here?

snuyanzin avatar Mar 21 '24 15:03 snuyanzin

this thing from PR description is not clear to me if we temporarily enable it, what is the long term plan here?

I was referring to debug commit https://github.com/apache/flink/pull/24285/commits/d0ea2d70bffaef5a03669240564e148ab6ea14d3 in this PR that enables the AdaptiveScheduler profile to verify the test execution but won't be merged into master

XComp avatar Mar 21 '24 15:03 XComp

I rebased the branch to most-recent master. But it won't make much of a difference: The failed tests in CI are unrelated to the changes of this PR.

XComp avatar Mar 21 '24 15:03 XComp

I created FLINK-34921 to cover the one test failure which should be unrelated to this PR's change. The second test failure is related to FLINK-34643

XComp avatar Mar 22 '24 14:03 XComp

CI with AdaptiveScheduler looks good. Only FLINK-34643 is failing. I'm gonna revert the CI change to make the PR ready to be merged.

XComp avatar Mar 23 '24 14:03 XComp