{2023.06,a64fx}[2023a] Rebuild SciPy-bundle 2023.07 with additional patches

Open boegel opened this issue 10 months ago • 11 comments

boegel avatar Jan 25 '25 15:01 boegel

Instance eessi-bot-mc-aws is configured to build for:

  • architectures: x86_64/generic, x86_64/intel/haswell, x86_64/intel/sapphire_rapids, x86_64/intel/skylake_avx512, x86_64/amd/zen2, x86_64/amd/zen3, aarch64/generic, aarch64/neoverse_n1, aarch64/neoverse_v1
  • repositories: eessi.io-2023.06-software, eessi.io-2023.06-compat

eessi-bot[bot] avatar Jan 25 '25 15:01 eessi-bot[bot]

Instance eessi-bot-mc-azure is configured to build for:

  • architectures: x86_64/amd/zen4
  • repositories: eessi.io-2023.06-software, eessi.io-2023.06-compat

eessi-bot[bot] avatar Jan 25 '25 15:01 eessi-bot[bot]

Instance boegel-bot-deucalion is configured to build for:

  • architectures: aarch64/a64fx
  • repositories: eessi.io-2023.06-software

bot: build repo:eessi.io-2023.06-software arch:aarch64/a64fx

boegel avatar Jan 25 '25 15:01 boegel

Updates by the bot instance eessi-bot-mc-aws (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:aarch64/a64fx from boegel

    • expanded format: build repository:eessi.io-2023.06-software architecture:aarch64/a64fx
  • handling command build repository:eessi.io-2023.06-software architecture:aarch64/a64fx resulted in:

    • no jobs were submitted

eessi-bot[bot] avatar Jan 25 '25 15:01 eessi-bot[bot]

Updates by the bot instance boegel-bot-deucalion (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:aarch64/a64fx from boegel

    • expanded format: build repository:eessi.io-2023.06-software architecture:aarch64/a64fx
  • handling command build repository:eessi.io-2023.06-software architecture:aarch64/a64fx resulted in:

    • submitted job 260213, for details & status see https://github.com/EESSI/software-layer/pull/894#issuecomment-2614012150

Updates by the bot instance eessi-bot-mc-azure (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:aarch64/a64fx from boegel

    • expanded format: build repository:eessi.io-2023.06-software architecture:aarch64/a64fx
  • handling command build repository:eessi.io-2023.06-software architecture:aarch64/a64fx resulted in:

    • no jobs were submitted
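
The bot updates above show two things: the short filter names in the command (`repo:`, `arch:`) are expanded to `repository:` and `architecture:`, and only the instance configured for `aarch64/a64fx` submits a job. A minimal sketch of that behaviour, assuming a simple prefix-and-split parser; the helper names `expand_bot_command` and `instances_for` are hypothetical and this is not the actual eessi-bot code:

```python
# Hypothetical sketch; NOT the real eessi-bot implementation.
# The abbreviation pairs are the ones visible in the bot updates above.
ABBREVIATIONS = {"repo": "repository", "arch": "architecture"}

# Architectures per instance, copied from the instance announcements above
# (the long eessi-bot-mc-aws list is left out for brevity).
INSTANCES = {
    "eessi-bot-mc-azure": ["x86_64/amd/zen4"],
    "boegel-bot-deucalion": ["aarch64/a64fx"],
}


def expand_bot_command(comment_line: str) -> str:
    """Turn 'bot: build repo:X arch:Y' into 'build repository:X architecture:Y'."""
    prefix = "bot:"
    if not comment_line.startswith(prefix):
        raise ValueError("not a bot command")
    action, *filters = comment_line[len(prefix):].split()
    expanded = []
    for item in filters:
        # Split only on the first colon so values may contain colons.
        key, sep, value = item.partition(":")
        expanded.append(f"{ABBREVIATIONS.get(key, key)}{sep}{value}")
    return " ".join([action, *expanded])


def instances_for(arch: str) -> list:
    """Names of the instances whose configured architectures include arch."""
    return [name for name, archs in INSTANCES.items() if arch in archs]


command = "bot: build repo:eessi.io-2023.06-software arch:aarch64/a64fx"
print(expand_bot_command(command))
print(instances_for("aarch64/a64fx"))
```

Under these assumptions only `boegel-bot-deucalion` matches, which is consistent with the other two instances reporting that no jobs were submitted.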

eessi-bot[bot] avatar Jan 25 '25 15:01 eessi-bot[bot]

New job on instance boegel-bot-deucalion for CPU micro-architecture aarch64-a64fx for repository eessi.io-2023.06-software in job dir /home/kehoste/project_dir/bot/jobs/2025.01/pr_894/260213

date job status comment
Jan 25 16:00:05 UTC 2025 submitted job id 260213 awaits release by job manager
Jan 25 16:00:52 UTC 2025 released job awaits launch by Slurm scheduler
Jan 25 16:01:55 UTC 2025 running job 260213 is running
Jan 25 17:14:34 UTC 2025 finished
:cry: FAILURE (click triangle for details)
Details
:white_check_mark: job output file slurm-260213.out
:x: found message matching FATAL:
:x: found message matching ERROR:
:white_check_mark: no message matching FAILED:
:x: found message matching required modules missing:
:x: no message matching No missing installations
:white_check_mark: found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-aarch64-a64fx-1737824923.tar.gz
size: 0 MiB (30427 bytes)
entries: 13
modules under 2023.06/software/linux/aarch64/a64fx/modules/all
EESSI-extend/2023.06-easybuild.lua
software under 2023.06/software/linux/aarch64/a64fx/software
EESSI-extend/2023.06-easybuild
other under 2023.06/software/linux/aarch64/a64fx
no other files in tarball
Jan 25 17:14:34 UTC 2025 test result
:cry: FAILURE (click triangle for details)
Reason
EESSI test suite was not run, test step itself failed to execute.
Details
:white_check_mark: job output file slurm-260213.out
:x: found message matching ERROR:
:white_check_mark: no message matching [\s*FAILED\s*].*Ran .* test case
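
The build-step details in the job report above amount to a list of patterns that must or must not appear in the job output file. A minimal sketch of such a check, assuming plain regular-expression searches over the whole file; the pattern strings come from the report, while the function names and the sample texts are hypothetical:

```python
import re

# (pattern, must_be_present) pairs; the patterns are taken from the job
# report above, the evaluation logic is an assumption for illustration.
CHECKS = [
    (r"FATAL:", False),
    (r"ERROR:", False),
    (r"FAILED:", False),
    (r"required modules missing:", False),
    (r"No missing installations", True),
    (r"\.tar\.gz created!", True),
]


def check_job_output(text: str) -> list:
    """Return (pattern, found, ok) for every check."""
    results = []
    for pattern, must_be_present in CHECKS:
        found = re.search(pattern, text) is not None
        results.append((pattern, found, found == must_be_present))
    return results


def job_succeeded(text: str) -> bool:
    """A job counts as successful only if every check is satisfied."""
    return all(ok for _, _, ok in check_job_output(text))


# Made-up sample output, only loosely modelled on the failure in this thread.
failing = "ERROR: Experimental functionality.\nsome/path.tar.gz created!\n"
passing = "No missing installations\nsome/path.tar.gz created!\n"
print(job_succeeded(failing), job_succeeded(passing))
```

With this logic a tarball being created is not enough on its own: one `ERROR:` match, or a missing `No missing installations` message, already marks the job as failed, which matches the report above.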

Hmm, failed with:

Feeding easystack file easystacks/software.eessi.io/2023.06/rebuilds/20250125-eb-4.9.4-SciPy-bundle-2023.07-bug-fixes-a64fx.yml to EasyBuild...
This is a rebuild, so using --try-amend=keeppreviousinstall=True to reuse the already created directory
ERROR: Experimental functionality. Behaviour might change/be removed later (use --experimental option to enable): Support for easybuild-ing from multiple easyconfigs based on information obtained from provided file (easystack) with build specifications.

So it's not happy because it's being fed an easystack file while the --experimental option is not enabled.

Right before, it shows the active EasyBuild configuration:

#
# Current EasyBuild configuration
# (C: command line argument, D: default value, E: environment variable, F: configuration file)
#
buildpath      (D) = /eessi_bot_job/.local/easybuild/build
containerpath  (D) = /eessi_bot_job/.local/easybuild/containers
installpath    (D) = /eessi_bot_job/.local/easybuild
repositorypath (D) = /eessi_bot_job/.local/easybuild/ebfiles_repo
robot-paths    (D) = /cvmfs/software.eessi.io/versions/2023.06/software/linux/aarch64/a64fx/software/EasyBuild/4.9.4/easybuild/easyconfigs
sourcepath     (E) = /home/kehoste/project_dir/bot/shared/easybuild/sources:
All set, let's start installing some software with EasyBuild v4.9.4 in ...

Note how it's even using the default installation prefix... 🙈
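
The configuration dump above marks each setting with its origin (C, D, E or F), so a default installation prefix can be spotted mechanically. A minimal sketch that parses lines of that shape and lists the settings still at their default; the regular expression and the helper names are assumptions, and treating a setting that is absent from the dump as default is also an assumption:

```python
import re

# Matches lines such as "installpath    (D) = /some/path"; comment lines
# starting with "#" and log lines do not match.
CONFIG_LINE = re.compile(
    r"^(?P<name>[\w-]+)\s+\((?P<origin>[CDEF])\)\s+=\s+(?P<value>.*)$"
)


def parse_config_dump(text: str) -> dict:
    """Map setting name -> (origin letter, value)."""
    settings = {}
    for line in text.splitlines():
        match = CONFIG_LINE.match(line.strip())
        if match:
            settings[match["name"]] = (match["origin"], match["value"])
    return settings


def still_default(settings: dict, names: list) -> list:
    """Names from the given list whose origin is D (default value)."""
    # Assumption: a setting missing from the dump is treated as default.
    return [n for n in names if settings.get(n, ("D", ""))[0] == "D"]


# Two lines copied from the dump above, plus its header comment.
dump = """\
# Current EasyBuild configuration
installpath    (D) = /eessi_bot_job/.local/easybuild
sourcepath     (E) = /home/kehoste/project_dir/bot/shared/easybuild/sources:
"""
settings = parse_config_dump(dump)
print(still_default(settings, ["installpath", "sourcepath"]))
```

A check like this could fail a build early when `installpath` has origin `D`, instead of letting it install into the default prefix.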

Any ideas here @bedroge? This seems related to the changes in #871 which got merged via #866 (which is directly relevant here).

How come this wasn't a problem there?!

boegel avatar Jan 25 '25 20:01 boegel

Ah, ok, it has something to do with the EESSI-extend/2023.06-easybuild module, which is what's now being used to configure EasyBuild; the configure_easybuild script isn't actually being used anymore.

This module isn't in place yet for A64FX, and while it's being generated by the installation script (as seen in the generated tarball), it's not correctly picked up somehow...

boegel avatar Jan 25 '25 20:01 boegel

Is there something wrong with the logic in https://github.com/EESSI/software-layer/blob/2023.06-software.eessi.io/load_eessi_extend_module.sh (called in https://github.com/EESSI/software-layer/blob/54f8d0f06de92baa08cab2913be580b7f7a13351/EESSI-install-software.sh#L254)? I can't see an issue; it should error out if the module fails to load.

Does the loading of an older EasyBuild module cause EESSI-extend to be unloaded? It shouldn't.

ocaisa avatar Jan 27 '25 08:01 ocaisa

@boegel can you retarget this PR, and check whether the issue mentioned above is still a problem?

laraPPr avatar Jun 27 '25 14:06 laraPPr

Same build is done in #1027, closing this PR.

bedroge avatar Oct 12 '25 17:10 bedroge