Add new init scripts for new initialization module

Open MaKaNu opened this issue 1 year ago • 6 comments

This is a followup PR for https://github.com/EESSI/software-layer/pull/667

After the successful merge of #667, I have prepared some test cases to test against the different shells.

For the moment, I am struggling with the implementation for csh. The problem only seems to occur when I try to load the module inside csh, and the error message csh gives is not very informative.

MaKaNu avatar Aug 12 '24 12:08 MaKaNu

Instance eessi-bot-mc-aws is configured to build for:

  • architectures: x86_64/generic, x86_64/intel/haswell, x86_64/intel/skylake_avx512, x86_64/amd/zen2, x86_64/amd/zen3, aarch64/generic, aarch64/neoverse_n1, aarch64/neoverse_v1
  • repositories: eessi.io-2023.06-compat, eessi-hpc.org-2023.06-software, eessi-hpc.org-2023.06-compat, eessi.io-2023.06-software

eessi-bot[bot] avatar Aug 12 '24 12:08 eessi-bot[bot]

Instance boegel-bot-deucalion is configured to build for:

  • architectures: aarch64/a64fx
  • repositories: eessi.io-2023.06-software

Instance eessi-bot-mc-azure is configured to build for:

  • architectures: x86_64/amd/zen4
  • repositories: eessi.io-2023.06-compat, eessi-hpc.org-2023.06-compat, eessi-hpc.org-2023.06-software, eessi.io-2023.06-software

eessi-bot[bot] avatar Aug 12 '24 12:08 eessi-bot[bot]

OK, it expanded PS1 inside the git commit message :smiling_face_with_tear:

MaKaNu avatar Aug 12 '24 14:08 MaKaNu

module load $LMOD_SYSTEM_DEFAULT_MODULES is a little brittle, as it won't work if there is more than one module in the variable (I think). It's probably best to take the official route as per the Lmod docs (shell dependent):

Implemented in the last commit. It works as expected, except for csh.
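
For reference, the bash variant of the route described in the Lmod documentation looks roughly like the sketch below. It is wrapped in a function named load_default_modules, a name made up for this illustration, and it falls back to StdEnv, which is the placeholder default from the Lmod docs and not necessarily what this PR uses. The csh variant follows the same logic with different syntax.

```bash
#!/usr/bin/env bash
# Sketch of the default-module loading pattern from the Lmod docs (bash flavour).
# Assumptions: the function name is hypothetical, and "StdEnv" is only the
# placeholder default used in the Lmod documentation.
load_default_modules() {
    if [ -z "${__Init_Default_Modules:-}" ]; then
        # First time in this environment: remember that we ran ...
        export __Init_Default_Modules=1
        # ... keep any predefined default list, otherwise use the placeholder ...
        export LMOD_SYSTEM_DEFAULT_MODULES="${LMOD_SYSTEM_DEFAULT_MODULES:-StdEnv}"
        # ... and let Lmod load the defaults (handles a list of several modules).
        module --initial_load --no_redirect restore
    else
        # Already initialised (e.g. in a subshell): only refresh aliases/functions.
        module refresh
    fi
}
```

The function assumes Lmod has already defined the module command in the current shell. The __Init_Default_Modules guard means that sourcing the script a second time refreshes the environment instead of reloading the defaults.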

MaKaNu avatar Aug 13 '24 21:08 MaKaNu

The pushed tests are expected to fail for now (they become interesting after #667 is merged).

MaKaNu avatar Aug 16 '24 15:08 MaKaNu

We never included PS1 in #667 so your tests will fail for that currently. If you are keen to have that, it could be part of this PR or a follow-up after this is merged (this PR has higher priority in my opinion)

ocaisa avatar Sep 05 '24 13:09 ocaisa

Yes, I followed the complete process and agree. The commits are reverted and can easily be reactivated later.

MaKaNu avatar Sep 06 '24 16:09 MaKaNu

I am still struggling to test csh correctly. The script itself seems to work, but redirection in csh is a pain, and I rely on redirection a lot in the tests.
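
One possible way around this, sketched under the assumption that the tests are driven from bash: csh has no syntax for redirecting stderr on its own (>& always merges it with stdout), so the redirection can be done by the calling shell instead. The helper name run_in_shell and the two RUN_* variables are hypothetical and are not taken from the tests in this PR.

```bash
#!/usr/bin/env bash
# Hypothetical test helper: run a command string in the given shell and capture
# stdout and stderr in separate files. The redirection is performed by the
# calling bash process, so the shell under test (e.g. csh) needs no redirection
# syntax of its own.
run_in_shell() {
    local shell="$1" cmd="$2"
    RUN_STDOUT_FILE="$(mktemp)"
    RUN_STDERR_FILE="$(mktemp)"
    # The function's exit status is the exit status of the shell under test.
    "$shell" -c "$cmd" >"$RUN_STDOUT_FILE" 2>"$RUN_STDERR_FILE"
}
```

A call such as run_in_shell csh '<commands>' then leaves the two streams in "$RUN_STDOUT_FILE" and "$RUN_STDERR_FILE", where the test can inspect them with the usual bash tools.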

MaKaNu avatar Sep 06 '24 17:09 MaKaNu

No matter whether I use x86_64/generic or the actual arch of the system, the path always points to the actual arch. I have fixed this by testing against a simple regex group.
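
The regex used in the tests is not shown in this thread. As a rough illustration of the idea, the check could look like the sketch below, which accepts any of the architectures the bots listed above build for. The function name path_has_known_arch and the exact pattern are assumptions; the path layout follows the tarball listing (2023.06/software/linux/<arch>/...).

```bash
#!/usr/bin/env bash
# Hypothetical check: succeed if a path contains one of the known CPU targets,
# instead of comparing against a single expected architecture.
path_has_known_arch() {
    local path="$1"
    # Alternation groups cover the architectures named by the bot instances.
    local pattern='/software/linux/(x86_64/(generic|intel/(haswell|skylake_avx512)|amd/(zen2|zen3|zen4))|aarch64/(generic|neoverse_n1|neoverse_v1|a64fx))(/|$)'
    [[ "$path" =~ $pattern ]]
}
```

Because the function only returns a status, it can be used directly in a test condition, for example: path_has_known_arch "$some_path" || echo "unexpected arch in $some_path".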

MaKaNu avatar Sep 11 '24 17:09 MaKaNu

bot: build repo:eessi.io-2023.06-software arch:x86_64/generic

ocaisa avatar Sep 12 '24 16:09 ocaisa

Updates by the bot instance eessi-bot-mc-aws (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:x86_64/generic from ocaisa

    • expanded format: build repository:eessi.io-2023.06-software architecture:x86_64/generic
  • handling command build repository:eessi.io-2023.06-software architecture:x86_64/generic resulted in:

    • submitted job 18438, for details & status see https://github.com/EESSI/software-layer/pull/668#issuecomment-2346791729

eessi-bot[bot] avatar Sep 12 '24 16:09 eessi-bot[bot]

Updates by the bot instance eessi-bot-mc-azure (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:x86_64/generic from ocaisa

    • expanded format: build repository:eessi.io-2023.06-software architecture:x86_64/generic
  • handling command build repository:eessi.io-2023.06-software architecture:x86_64/generic resulted in:

    • no jobs were submitted

eessi-bot[bot] avatar Sep 12 '24 16:09 eessi-bot[bot]

New job on instance eessi-bot-mc-aws for architecture x86_64-generic for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2024.09/pr_668/18438

date | job status | comment
Sep 12 16:50:38 UTC 2024 | submitted | job id 18438 awaits release by job manager
Sep 12 16:51:12 UTC 2024 | released | job awaits launch by Slurm scheduler
Sep 12 16:57:14 UTC 2024 | running | job 18438 is running
Sep 12 17:16:35 UTC 2024 | finished
:cry: FAILURE (click triangle for details)
Details
:white_check_mark: job output file slurm-18438.out
:white_check_mark: no message matching ERROR:
:white_check_mark: no message matching FAILED:
:white_check_mark: no message matching required modules missing:
:white_check_mark: found message(s) matching No missing installations
:white_check_mark: found message matching .tar.gz created!
Artefacts
No artefacts were created or found.
Sep 12 17:16:35 UTC 2024 | test result
:grin: SUCCESS (click triangle for details)
ReFrame Summary
[ PASSED ] Ran 18/18 test case(s) from 18 check(s) (0 failure(s), 0 skipped, 0 aborted)
Details
:white_check_mark: job output file slurm-18438.out
:white_check_mark: no message matching ERROR:
:white_check_mark: no message matching [\s*FAILED\s*].*Ran .* test case

eessi-bot[bot] avatar Sep 12 '24 16:09 eessi-bot[bot]

Updates by the bot instance boegel-bot-deucalion (click for details)
  • account ocaisa has NO permission to send commands to the bot

You need something similar to https://github.com/EESSI/software-layer/blob/2023.06-software.eessi.io/install_scripts.sh#L94-L97 so that the new scripts are actually deployed

ocaisa avatar Sep 12 '24 17:09 ocaisa

bot: build repo:eessi.io-2023.06-software arch:x86_64/generic

ocaisa avatar Sep 12 '24 20:09 ocaisa

Updates by the bot instance eessi-bot-mc-aws (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:x86_64/generic from ocaisa

    • expanded format: build repository:eessi.io-2023.06-software architecture:x86_64/generic
  • handling command build repository:eessi.io-2023.06-software architecture:x86_64/generic resulted in:

    • submitted job 18439, for details & status see https://github.com/EESSI/software-layer/pull/668#issuecomment-2347211038

eessi-bot[bot] avatar Sep 12 '24 20:09 eessi-bot[bot]

Updates by the bot instance eessi-bot-mc-azure (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:x86_64/generic from ocaisa

    • expanded format: build repository:eessi.io-2023.06-software architecture:x86_64/generic
  • handling command build repository:eessi.io-2023.06-software architecture:x86_64/generic resulted in:

    • no jobs were submitted

eessi-bot[bot] avatar Sep 12 '24 20:09 eessi-bot[bot]

Updates by the bot instance boegel-bot-deucalion (click for details)
  • account ocaisa has NO permission to send commands to the bot

New job on instance eessi-bot-mc-aws for architecture x86_64-generic for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2024.09/pr_668/18439

date | job status | comment
Sep 12 20:48:12 UTC 2024 | submitted | job id 18439 awaits release by job manager
Sep 12 20:48:54 UTC 2024 | released | job awaits launch by Slurm scheduler
Sep 12 20:54:57 UTC 2024 | running | job 18439 is running
Sep 12 21:14:17 UTC 2024 | finished
:grin: SUCCESS (click triangle for details)
Details
:white_check_mark: job output file slurm-18439.out
:white_check_mark: no message matching ERROR:
:white_check_mark: no message matching FAILED:
:white_check_mark: no message matching required modules missing:
:white_check_mark: found message(s) matching No missing installations
:white_check_mark: found message matching .tar.gz created!
Artefacts
eessi-2023.06-software-linux-x86_64-generic-1726174458.tar.gz
size: 0 MiB (756 bytes)
entries: 5
modules under 2023.06/software/linux/x86_64/generic/modules/all
no module files in tarball
software under 2023.06/software/linux/x86_64/generic/software
no software packages in tarball
other under 2023.06/software/linux/x86_64/generic
2023.06/init/lmod/bash
2023.06/init/lmod/csh
2023.06/init/lmod/fish
2023.06/init/lmod/ksh
2023.06/init/lmod/zsh
Sep 12 21:14:17 UTC 2024 | test result
:grin: SUCCESS (click triangle for details)
ReFrame Summary
[ PASSED ] Ran 18/18 test case(s) from 18 check(s) (0 failure(s), 0 skipped, 0 aborted)
Details
:white_check_mark: job output file slurm-18439.out
:white_check_mark: no message matching ERROR:
:white_check_mark: no message matching [\s*FAILED\s*].*Ran .* test case
Sep 12 22:31:36 UTC 2024 | uploaded | transfer of eessi-2023.06-software-linux-x86_64-generic-1726174458.tar.gz to S3 bucket succeeded

eessi-bot[bot] avatar Sep 12 '24 20:09 eessi-bot[bot]

bot: build repo:eessi.io-2023.06-software arch:x86_64/generic

MaKaNu avatar Sep 12 '24 21:09 MaKaNu

Updates by the bot instance eessi-bot-mc-aws (click for details)
  • account MaKaNu has NO permission to send commands to the bot

eessi-bot[bot] avatar Sep 12 '24 21:09 eessi-bot[bot]

Updates by the bot instance eessi-bot-mc-azure (click for details)
  • account MaKaNu has NO permission to send commands to the bot

eessi-bot[bot] avatar Sep 12 '24 21:09 eessi-bot[bot]

Updates by the bot instance boegel-bot-deucalion (click for details)
  • account MaKaNu has NO permission to send commands to the bot

Label bot:deploy has been set by user ocaisa, but this person does not have permission to trigger deployments

Staging PR merged, this is now in the wild!

ocaisa avatar Sep 12 '24 22:09 ocaisa