Unify Y-build and I-build configurations and their tests

Open HannesWell opened this issue 1 year ago • 33 comments

Recently I reworked the definition of I-build tests to reduce duplication and to make them more unified and simpler. As a result, the definition of Y-build tests has now significantly diverged from the I-build tests, and I intend to apply the same simplifications to Y-build tests as well.

While looking into all the definitions of I-builds and Y-builds I noticed that they have a lot in common, and I think we should try to unify their definitions and define both pipelines from one source, while accounting for the points where they differ. This raises the question of what exactly the differences between I-builds and Y-builds are. What I have found so far:

  • Y-builds build a branch other than master; at least for jdt.core this is the development branch of ECJ supporting the next Java release
  • Y-build tests have one linux setup that uses the latest EA JDK instead of the latest release
  • The results of Y-builds are published to another website than I-builds
  • Y-builds publish their build p2-repo e.g. to https://download.eclipse.org/eclipse/updates/4.35-Y-builds/ instead of https://download.eclipse.org/eclipse/updates/4.35-I-builds/
  • Y-builds don't publish to a Maven snapshot repository (I assume)
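The differences listed above could, in principle, be captured as parameters of one shared definition. The following is only a minimal Python sketch of that idea, not the actual Jenkins job definitions: the class and field names are hypothetical, the branch name BETA_JAVA24 is the one currently used for JDT, and the assumption that I-builds publish Maven snapshots is inferred from the last bullet.

```python
from dataclasses import dataclass

# Base URL taken from the p2 repository examples in the list above.
P2_BASE = "https://download.eclipse.org/eclipse/updates"


@dataclass(frozen=True)
class BuildConfig:
    """Hypothetical container for the points where I- and Y-builds differ."""

    build_type: str                # "I" or "Y"
    jdt_branch: str                # branch built for the JDT repositories
    test_with_ea_jdk: bool         # one Linux test setup uses the latest EA JDK
    publish_maven_snapshots: bool  # assumption: only I-builds publish snapshots

    def p2_repo(self, stream: str) -> str:
        """Return the p2 repository URL for a stream such as '4.35'."""
        return f"{P2_BASE}/{stream}-{self.build_type}-builds/"


I_BUILD = BuildConfig("I", "master", test_with_ea_jdk=False, publish_maven_snapshots=True)
Y_BUILD = BuildConfig("Y", "BETA_JAVA24", test_with_ea_jdk=True, publish_maven_snapshots=False)

for cfg in (I_BUILD, Y_BUILD):
    print(cfg.build_type, cfg.jdt_branch, cfg.p2_repo("4.35"))
```

Everything that is not listed as a field would then be shared between both build types by construction.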

Together with https://github.com/eclipse-platform/eclipse.platform.releng.aggregator/issues/1950 this should further simplify the overall releng. Of course a unification would have to happen in multiple steps.

An alternative solution would be to just develop ECJ's support for upcoming Java versions on the master branch and stop using another branch. AFAICT this has already been discussed, or rather is still being discussed, but will not be implemented soon.

The questions I have for the JDT team:

  • Are you fine with the proposed general goal?
  • Which of you is interested in discussing the details that come up along the way to making this happen?

@jarthana, @stephan-herrmann or @iloveeclipse are you interested in discussing this or can you tell who is interested?

Community

  • [x] I understand suggesting an enhancement doesn't mandate anyone to implement it. Other contributors may consider this suggestion, or not, at their own convenience. The most efficient way to get it fixed is that I implement it myself and contribute it back as a good quality patch to the project.

HannesWell avatar Dec 03 '24 23:12 HannesWell

I am interested in seeing successful Y-builds with useful test results, ideally with no big gaps around the times of release, or infra changes. If unification helps to this end, then surely I fully agree with the goals, and your description looks correct to me.

I don't, however, have much insight into the inner workings of I-builds or Y-builds, so I don't know in which detailed discussions I could help. I'm already looking into P-builds, which is about all I can contribute in terms of releng builds.

And while it's correct that the matter of master vs BETA_JAVA24 (all of jdt, not just jdt.core) is being discussed widely and for a long time already, it's pretty clear that the mere fact of branching will stay for the foreseeable future, for more than one reason.

stephan-herrmann avatar Dec 04 '24 00:12 stephan-herrmann

I am interested in seeing successful Y-builds with useful test results, ideally with no big gaps around the times of release, or infra changes. If unification helps to this end, then surely I fully agree with the goals, and your description looks correct to me.

It will simplify the maintenance of the Y-build and make it quicker to apply changes. It will also mean that the basic configurations that are meant to be equal stay equal. So I think this could be a contribution to that overall goal.

I don't, however, have much insight into the inner workings of I-builds or Y-builds, so I don't know in which detailed discussions I could help. I'm already looking into P-builds, which is about all I can contribute in terms of releng builds.

It's mainly when I encounter differences from the I-build and have to find out whether they are really necessary. Then I hope to get an answer from you or somebody else from the JDT team. The technical Jenkins details I can hopefully handle by myself.

And while it's correct that the matter of master vs BETA_JAVA24 (all of jdt, not just jdt.core) is being discussed widely and for a long time already, it's pretty clear that the mere fact of branching will stay for the foreseeable future, for more than one reason.

Yes, I just wanted to mention that option, knowing that it's at the moment not feasible. :)

HannesWell avatar Dec 04 '24 01:12 HannesWell

For your information, I have now taken the following further steps to unify Y-build and I-build tests:

  • https://github.com/eclipse-platform/eclipse.platform.releng.aggregator/pull/2633
  • https://github.com/eclipse-platform/eclipse.platform.releng.aggregator/pull/2634
  • https://github.com/eclipse-platform/eclipse.platform.releng.aggregator/pull/2635

The tasks executed and the resulting behavior should be the same, but please let me know if you encounter any problem upon the next execution. Or, if you don't plan to run another Y-build soon, do you mind if I start one?

Does the JDT team mind if obsolete Y-build test jobs, i.e. those that are no longer in use, are removed from https://ci.eclipse.org/releng/job/YPBuilds/ or do you need them for some time for reference?

HannesWell avatar Dec 05 '24 23:12 HannesWell

For your information, I have now taken the following further steps to unify Y-build and I-build tests:

To me it looks like the first execution after these changes was ok, without regressions due to infrastructure changes.

@stephan-herrmann or @jarthana I have two questions about the Y-build setup:

  1. Is there a reason why Y-build tests don't run on windows? Is it just to save precious computation time on the windows machine?

  2. Would it be possible to use the JDK installations provided by default for Eclipse Jenkins instances in Y-build tests on Linux? At the moment specific OpenJDKs are downloaded for each build: https://github.com/eclipse-platform/eclipse.platform.releng.aggregator/blob/768f0428136f3abcd85fd5e00b6f3d435abc0f80/JenkinsJobs/YBuilds/Y_unit_linux.groovy#L5-L7 But EF Jenkins provides all these JDKs already, even EA builds for not yet released versions: see 'openjdk-ea-latest' at https://github.com/eclipse-cbi/jiro/wiki/Tools-(JDK,-Maven,-Ant)#openjdk @fredg02 can you tell how often the EA builds are updated for all Jenkins instances?

To the JDT-devs: Would you be fine with changing to the provided 'openjdk-ea-latest' for the Y-build Java 24 tests and switching to Temurin for all others? Just like all other builds: https://github.com/eclipse-cbi/jiro/wiki/Tools-(JDK,-Maven,-Ant)#eclipse-temurin

HannesWell avatar Dec 07 '24 13:12 HannesWell

Does the JDT team mind if obsolete Y-build test jobs, i.e. those that are no longer in use, are removed from https://ci.eclipse.org/releng/job/YPBuilds/ or do you need them for some time for reference?

IMHO old jobs can be removed once the current set of jobs (build & test) is functional.

Let's perhaps just wait for one more +1 from @jarthana or @mpalat

stephan-herrmann avatar Dec 08 '24 13:12 stephan-herrmann

  1. Is there a reason why Y-build tests don't run on windows? Is it just to save precious computation time on the windows machine?

I'll have to pass this one, as I don't know the reason.

  2. Would it be possible to use the JDK installations provided by default for Eclipse Jenkins instances in Y-build tests on Linux? At the moment specific OpenJDKs are downloaded for each build: https://github.com/eclipse-platform/eclipse.platform.releng.aggregator/blob/768f0428136f3abcd85fd5e00b6f3d435abc0f80/JenkinsJobs/YBuilds/Y_unit_linux.groovy#L5-L7

But EF Jenkins provides all these JDKs already, even EA builds for not yet released versions: See 'openjdk-ea-latest' at https://github.com/eclipse-cbi/jiro/wiki/Tools-(JDK,-Maven,-Ant)#openjdk @fredg02 can you tell how often the EA builds are updated for all Jenkins instances?

IMHO, the specific JDK version is not relevant here, as long as we have some "good" EA build of the current stream.

To the JDT-devs: Would you be fine with changing to the provided 'openjdk-ea-latest' for the Y-build Java 24 tests and switching to Temurin for all others? Just like all other builds: https://github.com/eclipse-cbi/jiro/wiki/Tools-(JDK,-Maven,-Ant)#eclipse-temurin

No objections from my side.

stephan-herrmann avatar Dec 08 '24 13:12 stephan-herrmann

Does the JDT team mind if obsolete Y-build test jobs, i.e. those that are no longer in use, are removed from https://ci.eclipse.org/releng/job/YPBuilds/ or do you need them for some time for reference?

IMHO old jobs can be removed once the current set of jobs (build & test) is functional.

Let's perhaps just wait for one more +1 from @jarthana or @mpalat

No objections from me either.

1. Is there a reason why Y-build tests don't run on windows? Is it just to save precious computation time on the windows machine?

Just checked with @MohananRahul who says that it's configured but not configured to run automatically.

To the JDT-devs: Would you be fine with changing to the provided 'openjdk-ea-latest' for the Y-build Java 24 tests and switching to Temurin for all others? Just like all other builds: https://github.com/eclipse-cbi/jiro/wiki/Tools-(JDK,-Maven,-Ant)#eclipse-temurin

I agree with Stephan. No problem for me.

jarthana avatar Dec 09 '24 06:12 jarthana

  1. Is there a reason why Y-build tests don't run on windows? Is it just to save precious computation time on the windows machine?

Just checked with @MohananRahul who says that it's configured but not configured to run automatically.

Rahul just dug this up for me, thanks Rahul! https://github.com/eclipse-platform/eclipse.platform.releng.aggregator/pull/457 Still doesn't explain why, though.

jarthana avatar Dec 09 '24 06:12 jarthana

@fredg02 can you tell how often the EA builds are updated for all Jenkins instances?

They are mostly updated on demand.

fredg02 avatar Dec 09 '24 10:12 fredg02

Thanks for your agreement, I have unified this with:

  • https://github.com/eclipse-platform/eclipse.platform.releng.aggregator/pull/2690

Rahul just dug this up for me, thanks Rahul! #457

Thanks. Then just leave it as it is for now. The load is already high enough.

FYI, after the I-build and all its tests were moved to Java 21, I applied the same change to the Y-build:

  • https://github.com/eclipse-platform/eclipse.platform.releng.aggregator/pull/2691

HannesWell avatar Dec 19 '24 21:12 HannesWell

Test job https://ci.eclipse.org/releng/job/YPBuilds/job/ep435Y-unit-linux-x86_64-java24/ has been aborting since 2024-12-31.

Error is:

java.lang.OutOfMemoryError: Java heap space
Also:   org.jenkinsci.plugins.workflow.actions.ErrorAction$ErrorId: a030fa77-332d-4e01-8dec-f10f51ace8d9
Caused: java.io.IOException: Remote call on JNLP4-connect connection from 10.40.71.91/10.40.71.91:47894 failed
	at hudson.remoting.Channel.call(Channel.java:1116)
	at hudson.FilePath.act(FilePath.java:1228)
	at hudson.FilePath.act(FilePath.java:1217)
	at PluginClassLoader for junit//hudson.tasks.junit.JUnitParser.parseResult(JUnitParser.java:146)
	at PluginClassLoader for junit//hudson.tasks.junit.JUnitResultArchiver.parse(JUnitResultArchiver.java:177)
	at PluginClassLoader for junit//hudson.tasks.junit.JUnitResultArchiver.parseAndSummarize(JUnitResultArchiver.java:282)
	at PluginClassLoader for junit//hudson.tasks.junit.pipeline.JUnitResultsStepExecution.run(JUnitResultsStepExecution.java:62)
	at PluginClassLoader for junit//hudson.tasks.junit.pipeline.JUnitResultsStepExecution.run(JUnitResultsStepExecution.java:27)
	at PluginClassLoader for workflow-step-api//org.jenkinsci.plugins.workflow.steps.SynchronousNonBlockingStepExecution.lambda$start$0(SynchronousNonBlockingStepExecution.java:47)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
	at java.base/java.lang.Thread.run(Thread.java:857)
Finished: FAILURE

Could this be connected to changes done here?

stephan-herrmann avatar Jan 04 '25 12:01 stephan-herrmann

Test job https://ci.eclipse.org/releng/job/YPBuilds/job/ep435Y-unit-linux-x86_64-java24/ has been aborting since 2024-12-31.

Error is:

java.lang.OutOfMemoryError: Java heap space

Could this be connected to changes done here?

I saw that as well, but since the jobs are more and more unified (which is the goal of this issue) and neither the Java 23 tests nor the I-builds show this, I was hoping that the cause is an increased memory consumption (or maybe even a memory leak?) with Java 24. Is that possible?

To help us in finding the cause I replayed the latest build, with -XX:+HeapDumpOnOutOfMemoryError enabled (as described in https://www.baeldung.com/java-heap-dump-capture#automatically): https://ci.eclipse.org/releng/job/YPBuilds/job/ep435Y-unit-linux-x86_64-java24/15/

Maybe that's something we should enable in general, because it has no cost if no OOM error happens and helps to track down memory issues (for the builds as well, not only the tests)?

HannesWell avatar Jan 04 '25 17:01 HannesWell

I was hoping that the cause is an increased memory consumption (or maybe even a memory leak?)

As the error happens while parsing test results, the leak might simply be a flood of failures that blows up memory. But due to the error we cannot see those test failures :-/

I wonder if the memory limit is defined by the executor or as a build parameter?

stephan-herrmann avatar Jan 04 '25 18:01 stephan-herrmann

I was hoping that the cause is an increased memory consumption (or maybe even a memory leak?)

As the error happens while parsing test results, the leak might simply be a flood of failures that blows up memory. But due to the error we cannot see those test failures :-/

Good point. I didn't look at the logs in that much detail 😅

I wonder if the memory limit is defined by the executor or as a build parameter?

For Linux the tests run in a Kubernetes pod, and we can indeed use a pod with increased resource limits by replacing

agent {
  label 'ubuntu-2404'
}

with

agent {
  kubernetes {
    inheritFrom 'ubuntu-2404'
    yaml """
apiVersion: v1
kind: Pod
spec:
  containers:
  - name: "jnlp"
    resources:
      limits:
        memory: "8Gi"
        cpu: "4000m"
      requests:
        memory: "4Gi"
        cpu: "2000m"
"""
  }
}

I have started another replay where I applied this:

  • https://ci.eclipse.org/releng/job/YPBuilds/job/ep435Y-unit-linux-x86_64-java24/16

You can see the exact diff in the job configuration at (you probably have to be logged in to Jenkins):

  • https://ci.eclipse.org/releng/job/YPBuilds/job/ep435Y-unit-linux-x86_64-java24/16/replay/diff

HannesWell avatar Jan 04 '25 18:01 HannesWell

It failed again :/ But you could have a look at the test result files directly: https://ci.eclipse.org/releng/job/YPBuilds/job/ep435Y-unit-linux-x86_64-java24/16/artifact/workarea/Y20250104-1000/eclipse-testing/results/xml/

Some of them have multiple MiB, but I don't see an error or failure. And although the number of tests didn't change, the size of org.eclipse.jdt.core.tests.compiler suite almost doubled. Maybe the configuration changed? E.g. I see 22 hits for <testcase classname="org.eclipse.jdt.core.tests.compiler.regression.TestAll" name="test067 - 16" (randomly chosen)

Maybe this also leads to an endless loop in the test-result reading step?
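Counting how often each test case appears in a result file makes this kind of duplication easy to spot. Below is a minimal Python sketch using only the standard library. The inline XML is a made-up sample in the usual JUnit report layout, not content from the actual result files, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def count_testcases(xml_text: str) -> Counter:
    """Count <testcase> elements per (classname, name) pair."""
    root = ET.fromstring(xml_text)
    return Counter(
        (tc.get("classname"), tc.get("name")) for tc in root.iter("testcase")
    )


def repeated(counts: Counter, threshold: int = 1) -> dict:
    """Return only the test cases that occur more than `threshold` times."""
    return {key: n for key, n in counts.items() if n > threshold}


# Made-up sample: one test case reported three times, another one once.
SAMPLE = """
<testsuite name="sample">
  <testcase classname="org.eclipse.jdt.core.tests.compiler.regression.TestAll" name="test067 - 16"/>
  <testcase classname="org.eclipse.jdt.core.tests.compiler.regression.TestAll" name="test067 - 16"/>
  <testcase classname="org.eclipse.jdt.core.tests.compiler.regression.TestAll" name="test067 - 16"/>
  <testcase classname="org.eclipse.jdt.core.tests.compiler.regression.TestAll" name="test068 - 16"/>
</testsuite>
"""

counts = count_testcases(SAMPLE)
for (classname, name), n in repeated(counts).items():
    print(f"{n}x {classname} {name}")
```

For a real result file, the text passed to `count_testcases` would be read from one of the downloaded XML files. For files of several MiB, a streaming parser such as `ET.iterparse` may be the better choice.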

HannesWell avatar Jan 05 '25 15:01 HannesWell

It failed again :/ But you could have a look at the test result files directly: https://ci.eclipse.org/releng/job/YPBuilds/job/ep435Y-unit-linux-x86_64-java24/16/artifact/workarea/Y20250104-1000/eclipse-testing/results/xml/

Some of them have multiple MiB, but I don't see an error or failure. And although the number of tests didn't change, the size of org.eclipse.jdt.core.tests.compiler suite almost doubled. Maybe the configuration changed? E.g. I see 22 hits for <testcase classname="org.eclipse.jdt.core.tests.compiler.regression.TestAll" name="test067 - 16" (randomly chosen)

Maybe this also leads to an endless loop in the test-result reading step?

Thanks! This led to a candidate culprit: The recent merge from master to BETA_JAVA24 removed -Dcompliance.jre.24=1.8,11,17,21,24 from the test invocation resulting in all tests being executed at 16 distinct compliance levels rather than the intended 5. This didn't surface in PR builds, because the bogus change was in test.xml, which is not used in maven builds, but apparently this is what still drives tests during production builds.

Over to https://github.com/eclipse-jdt/eclipse.jdt.core/pull/3517
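To put the compliance-level numbers mentioned above into perspective, here is a small Python sketch that parses the property value and compares the intended number of levels with the 16 that were actually run. The helper name is hypothetical, and the sketch assumes that each test is executed once per compliance level.

```python
def compliance_levels(prop_value: str) -> list[str]:
    """Split a comma-separated compliance list such as '1.8,11,17,21,24'."""
    return [level.strip() for level in prop_value.split(",") if level.strip()]


# Value of -Dcompliance.jre.24 that the merge had removed from the invocation.
intended = compliance_levels("1.8,11,17,21,24")

# Number of distinct levels executed without the property, as stated above.
actual_level_count = 16

# Assumption: one execution of every test per compliance level.
factor = actual_level_count / len(intended)
print(f"{len(intended)} intended levels, {actual_level_count} executed, factor {factor:.1f}")
```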

stephan-herrmann avatar Jan 05 '25 17:01 stephan-herrmann

Nice teamwork. 👍

merks avatar Jan 05 '25 18:01 merks

I have just pushed another unification of the result website via:

  • https://github.com/eclipse-platform/eclipse.platform.releng.aggregator/pull/2754

The result shouldn't really differ from before.

@stephan-herrmann I'm also reading your thread on the jdt-dev mailing list about how to compile ECJ on the Y-build not with the latest I-build but with the latest Y-build (if I'm not mistaken that's the actual intention isn't it?). I have an idea for that, but have to verify it first before implementing it. I'll try to propose a patch in the next few days.

HannesWell avatar Jan 16 '25 18:01 HannesWell

@stephan-herrmann I'm also reading your thread on the jdt-dev mailing list about how to compile ECJ on the Y-build not with the latest I-build but with the latest Y-build (if I'm not mistaken that's the actual intention isn't it?).

I think you got me right. There are just two things I'm not 100% sure about:

  • at least in PR builds we use the compiler built in the very same build, I assume this is not happening in any production builds?
  • you say "to compile ECJ", but isn't it actually about all compilation during that build, not just ECJ self-compilation?

Bottom line: the same dogfooding from beta branches as I-builds do for the master branch would be great (and for the rare use case discussed recently, this would prevent a strange hiccup).

stephan-herrmann avatar Jan 16 '25 19:01 stephan-herrmann

Hi @stephan-herrmann, I don't know if this is the right place to ask but I think it fits with the overall effort of these activities. Would it be possible to also set up a pipeline to generate maven snapshots of ecj with BETA_JAVA24 (and future releases)?

adisandro avatar Feb 20 '25 11:02 adisandro

Would it be possible to also set up a pipeline to generate maven snapshots of ecj with BETA_JAVA24 (and future releases)?

I think it would be feasible. We would have to make the source p2-repo from which to feed the Maven (snapshot) publication configurable: https://github.com/eclipse-platform/eclipse.platform.releng.aggregator/blob/233bee458916f449cd31f7a5ecd21283112ac0d8/JenkinsJobs/Releng/FOLDER.groovy#L31-L51 Then this would have to be considered in the aggregation: https://github.com/eclipse-platform/eclipse.platform.releng.aggregator/blob/233bee458916f449cd31f7a5ecd21283112ac0d8/eclipse.platform.releng/publish-to-maven-central/SDK4Mvn.aggr#L5-L8

And then we probably want to limit the artifacts 'published' to only JDT or even less and probably also want to use another target-repository as well to not conflict with the I-build artifacts (that might have a newer time-stamp but don't have the BETA_JAVA support).

The first steps would also allow us to simplify the release publication, as we could just define the release repository as a parameter without needing two commits/PRs to create the necessary file content, as was done for example for this release:

  • https://github.com/eclipse-platform/eclipse.platform.releng.aggregator/issues/2606
  • https://github.com/eclipse-platform/eclipse.platform.releng.aggregator/pull/2613/files
  • https://github.com/eclipse-platform/eclipse.platform.releng.aggregator/pull/2624/files

So it would be good in general to parameterize the Maven publication, and if the JDT devs are interested I can also look further into making a publication of Maven snapshots happen. But I won't have the time to do it before April.

HannesWell avatar Feb 22 '25 10:02 HannesWell

Would it be possible to also set up a pipeline to generate maven snapshots of ecj with BETA_JAVA24 (and future releases)?

If only the ECJ is wanted, one can simply deploy that single thing from the build with Maven; there is no need to go through the full repository and aggregation...

laeubi avatar Feb 22 '25 10:02 laeubi

Would it be possible to also set up a pipeline to generate maven snapshots of ecj with BETA_JAVA24 (and future releases)?

If only the ECJ is wanted, one can simply deploy that single thing from the build with Maven; there is no need to go through the full repository and aggregation...

At least the current P(atch)-build consists of a few more bundles, so I assume these are the ones of interest and probably only work together? https://github.com/eclipse-jdt/eclipse.jdt/blob/35ccea5e5721b558f64f591c560d6b8d9ac1dc95/org.eclipse.jdt.releng/patchbuild/src/feature.xml.in#L24-L46

HannesWell avatar Feb 22 '25 11:02 HannesWell

Would it be possible to also set up a pipeline to generate maven snapshots of ecj with BETA_JAVA24 (and future releases)?

If only the ECJ is wanted, one can simply deploy that single thing from the build with Maven; there is no need to go through the full repository and aggregation...

At least the current P(atch)-build consists of a few more bundles, so I assume these are the ones of interest and probably only work together? https://github.com/eclipse-jdt/eclipse.jdt/blob/35ccea5e5721b558f64f591c560d6b8d9ac1dc95/org.eclipse.jdt.releng/patchbuild/src/feature.xml.in#L24-L46

That's the set of bundles needed in the IDE, but the batch compiler ecj can be used standalone or embedded in an application without any dependencies.

stephan-herrmann avatar Feb 27 '25 18:02 stephan-herrmann

Currently Y-builds fail:

[2025-03-04T15:05:26.224Z] [FATAL] Non-resolvable parent POM for org.eclipse.jdt:eclipse.jdt.core:4.35.0-SNAPSHOT: The following artifacts could not be resolved: org.eclipse:eclipse-platform-parent:pom:4.35.0-SNAPSHOT (absent): Could not find artifact org.eclipse:eclipse-platform-parent:pom:4.35.0-SNAPSHOT and 'parent.relativePath' points at wrong local POM @ line 19, column 11
[2025-03-04T15:05:26.224Z] [FATAL] Non-resolvable parent POM for eclipse.jdt.debug:eclipse.jdt.debug:4.35.0-SNAPSHOT: The following artifacts could not be resolved: org.eclipse:eclipse-platform-parent:pom:4.35.0-SNAPSHOT (absent): Could not find artifact org.eclipse:eclipse-platform-parent:pom:4.35.0-SNAPSHOT and 'parent.relativePath' points at wrong local POM @ line 15, column 11
[2025-03-04T15:05:26.224Z] [FATAL] Non-resolvable parent POM for eclipse.jdt.ui:eclipse.jdt.ui:4.35.0-SNAPSHOT: The following artifacts could not be resolved: org.eclipse:eclipse-platform-parent:pom:4.35.0-SNAPSHOT (absent): Could not find artifact org.eclipse:eclipse-platform-parent:pom:4.35.0-SNAPSHOT and 'parent.relativePath' points at wrong local POM @ line 19, column 11

Is there a connection to #2851, i.e., will there be fresh I-builds that provide the required snapshot parent?

Does the Y-build configuration need to search at a different location?

Should JDT features update their parent link to the release version 4.35.0?

Cc: @MohananRahul

stephan-herrmann avatar Mar 04 '25 22:03 stephan-herrmann

Next guess: https://github.com/eclipse-platform/eclipse.platform.releng.aggregator/blob/master/cje-production/streams/repositories_java24.txt may need to s/master/R4_35_maintenance/ right?

stephan-herrmann avatar Mar 04 '25 23:03 stephan-herrmann

Next guess: https://github.com/eclipse-platform/eclipse.platform.releng.aggregator/blob/master/cje-production/streams/repositories_java24.txt may need to s/master/R4_35_maintenance/ right?

We need to proceed with running the 4.35 YBuild from the following link: 4.35 YBuild, for which #2851 is done. @HannesWell, is there anything else we missed?

MohananRahul avatar Mar 05 '25 05:03 MohananRahul

I have done the last merge for the release from R4_35_maintenance for both jdt.core and jdt.ui.

jarthana avatar Mar 05 '25 06:03 jarthana

The YBuild is available: https://ci.eclipse.org/releng/job/YPBuilds/job/Y-build-4.35/104/

MohananRahul avatar Mar 05 '25 06:03 MohananRahul

Next guess: https://github.com/eclipse-platform/eclipse.platform.releng.aggregator/blob/master/cje-production/streams/repositories_java24.txt may need to s/master/R4_35_maintenance/ right?

We need to proceed with running the 4.35 YBuild from the following link: 4.35 YBuild, for which #2851 is done. @HannesWell, is there anything else we missed?

Thanks. I didn't know that the Y-build itself moves to maintenance (in platform.releng). After seeing several builds fail I wasn't quite sure if anybody was looking into it.

stephan-herrmann avatar Mar 05 '25 09:03 stephan-herrmann