Use ReFrame's CPU autodetect in test step

Open casparvl opened this issue 1 year ago • 41 comments

I've figured out how we can use ReFrame's CPU autodetection with the local spawner: we inject the name of the Slurm partition the job is running in into the ReFrame configuration file. That way, each Slurm partition gets its own autodetected topology file. Note that the autodetection only needs to happen once per architecture; after that, the result stays "forever" in the .reframe directory in the bot's home directory.

Using the autodetection is a good idea, as it guarantees that all the CPU info we potentially rely on in the EESSI test suite is present. This is preferable to hard-coding it, and is actually what our own documentation recommends :D
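
To illustrate the idea, here is a minimal sketch of a ReFrame configuration file that takes its partition name from the Slurm environment. It is not the code from this PR: the system name `BotBuildTests`, the helper `make_site_config`, the fallback name `unknown_partition`, and the name sanitizing are assumptions made for the example. `SLURM_JOB_PARTITION` is the variable Slurm sets inside a job. Leaving out the `processor` entry is what leaves the CPU info to ReFrame's autodetection.

```python
import os
import re


def make_site_config(partition_name: str) -> dict:
    """Build a minimal ReFrame site configuration for a single partition."""
    # Assumption: keep the name conservative by replacing anything that is
    # not a letter, digit or underscore.
    safe_name = re.sub(r"[^A-Za-z0-9_]", "_", partition_name)
    return {
        "systems": [
            {
                "name": "BotBuildTests",  # hypothetical system name
                "descr": "Software-layer bot test step",
                "hostnames": [".*"],
                "partitions": [
                    {
                        # One ReFrame partition per Slurm partition, so each
                        # one gets its own autodetected topology file.
                        "name": safe_name,
                        "scheduler": "local",
                        "launcher": "mpirun",
                        "environs": ["default"],
                        # No "processor" entry here: the CPU info is left
                        # to ReFrame's autodetection instead of hard-coded.
                    }
                ],
            }
        ],
        "environments": [{"name": "default"}],
    }


# ReFrame reads the variable named site_configuration from a Python config file.
site_configuration = make_site_config(
    os.environ.get("SLURM_JOB_PARTITION", "unknown_partition")
)
```

With a configuration along these lines, the first test run in a given Slurm partition would trigger the detection, and later runs in the same partition would reuse the stored topology file.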

casparvl avatar Aug 22 '24 14:08 casparvl

Instance eessi-bot-mc-aws is configured to build for:

  • architectures: x86_64/generic, x86_64/intel/haswell, x86_64/intel/skylake_avx512, x86_64/amd/zen2, x86_64/amd/zen3, aarch64/generic, aarch64/neoverse_n1, aarch64/neoverse_v1
  • repositories: eessi.io-2023.06-compat, eessi-hpc.org-2023.06-software, eessi-hpc.org-2023.06-compat, eessi.io-2023.06-software

eessi-bot[bot] avatar Aug 22 '24 14:08 eessi-bot[bot]

Instance boegel-bot-deucalion is configured to build for:

  • architectures: aarch64/a64fx
  • repositories: eessi.io-2023.06-software

Instance eessi-bot-mc-azure is configured to build for:

  • architectures: x86_64/amd/zen4
  • repositories: eessi.io-2023.06-compat, eessi-hpc.org-2023.06-compat, eessi-hpc.org-2023.06-software, eessi.io-2023.06-software

eessi-bot[bot] avatar Aug 22 '24 14:08 eessi-bot[bot]

bot: build repo:eessi.io-2023.06-software arch:x86_64/amd/zen3

casparvl avatar Aug 22 '24 14:08 casparvl

Updates by the bot instance eessi-bot-mc-aws (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:x86_64/amd/zen3 from casparvl

    • expanded format: build repository:eessi.io-2023.06-software architecture:x86_64/amd/zen3
  • handling command build repository:eessi.io-2023.06-software architecture:x86_64/amd/zen3 resulted in:

    • submitted job 16827, for details & status see https://github.com/EESSI/software-layer/pull/682#issuecomment-2304840397

eessi-bot[bot] avatar Aug 22 '24 14:08 eessi-bot[bot]

Updates by the bot instance eessi-bot-mc-azure (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:x86_64/amd/zen3 from casparvl

    • expanded format: build repository:eessi.io-2023.06-software architecture:x86_64/amd/zen3
  • handling command build repository:eessi.io-2023.06-software architecture:x86_64/amd/zen3 resulted in:

    • no jobs were submitted

eessi-bot[bot] avatar Aug 22 '24 14:08 eessi-bot[bot]

Updates by the bot instance boegel-bot-deucalion (click for details)
  • account casparvl has NO permission to send commands to the bot

New job on instance eessi-bot-mc-aws for architecture x86_64-amd-zen3 for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2024.08/pr_682/16827

date job status comment
Aug 22 14:37:54 UTC 2024 submitted job id 16827 awaits release by job manager
Aug 22 14:38:15 UTC 2024 released job awaits launch by Slurm scheduler
Aug 22 14:44:28 UTC 2024 finished
:cry: FAILURE (click triangle for details)
Details
:white_check_mark: job output file slurm-16827.out
:white_check_mark: no message matching ERROR:
:white_check_mark: no message matching FAILED:
:white_check_mark: no message matching required modules missing:
:white_check_mark: found message(s) matching No missing installations
:white_check_mark: found message matching .tar.gz created!
Artefacts
No artefacts were created or found.
Aug 22 14:44:28 UTC 2024 test result
:cry: FAILURE (click triangle for details)
Reason
EESSI test suite was not run, test step itself failed to execute.
Details
:white_check_mark: job output file slurm-16827.out
:x: found message matching ERROR:
:white_check_mark: no message matching [\s*FAILED\s*].*Ran .* test case

eessi-bot[bot] avatar Aug 22 '24 14:08 eessi-bot[bot]

bot: build repo:eessi.io-2023.06-software arch:x86_64/amd/zen3

casparvl avatar Aug 22 '24 14:08 casparvl

Updates by the bot instance eessi-bot-mc-aws (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:x86_64/amd/zen3 from casparvl

    • expanded format: build repository:eessi.io-2023.06-software architecture:x86_64/amd/zen3
  • handling command build repository:eessi.io-2023.06-software architecture:x86_64/amd/zen3 resulted in:

    • submitted job 16828, for details & status see https://github.com/EESSI/software-layer/pull/682#issuecomment-2304869433

eessi-bot[bot] avatar Aug 22 '24 14:08 eessi-bot[bot]

Updates by the bot instance boegel-bot-deucalion (click for details)
  • account casparvl has NO permission to send commands to the bot

Updates by the bot instance eessi-bot-mc-azure (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:x86_64/amd/zen3 from casparvl

    • expanded format: build repository:eessi.io-2023.06-software architecture:x86_64/amd/zen3
  • handling command build repository:eessi.io-2023.06-software architecture:x86_64/amd/zen3 resulted in:

    • no jobs were submitted

eessi-bot[bot] avatar Aug 22 '24 14:08 eessi-bot[bot]

New job on instance eessi-bot-mc-aws for architecture x86_64-amd-zen3 for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2024.08/pr_682/16828

date job status comment
Aug 22 14:50:20 UTC 2024 submitted job id 16828 awaits release by job manager
Aug 22 14:50:39 UTC 2024 released job awaits launch by Slurm scheduler
Aug 22 14:51:43 UTC 2024 finished
:cry: FAILURE (click triangle for details)
Details
:white_check_mark: job output file slurm-16828.out
:white_check_mark: no message matching ERROR:
:white_check_mark: no message matching FAILED:
:white_check_mark: no message matching required modules missing:
:white_check_mark: found message(s) matching No missing installations
:white_check_mark: found message matching .tar.gz created!
Artefacts
No artefacts were created or found.
Aug 22 14:51:43 UTC 2024 test result
:cry: FAILURE (click triangle for details)
Reason
EESSI test suite was not run, test step itself failed to execute.
Details
:white_check_mark: job output file slurm-16828.out
:x: found message matching ERROR:
:white_check_mark: no message matching [\s*FAILED\s*].*Ran .* test case

eessi-bot[bot] avatar Aug 22 '24 14:08 eessi-bot[bot]

bot: build repo:eessi.io-2023.06-software arch:x86_64/amd/zen3

casparvl avatar Aug 22 '24 15:08 casparvl

Updates by the bot instance eessi-bot-mc-aws (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:x86_64/amd/zen3 from casparvl

    • expanded format: build repository:eessi.io-2023.06-software architecture:x86_64/amd/zen3
  • handling command build repository:eessi.io-2023.06-software architecture:x86_64/amd/zen3 resulted in:

    • submitted job 16829, for details & status see https://github.com/EESSI/software-layer/pull/682#issuecomment-2304908380

eessi-bot[bot] avatar Aug 22 '24 15:08 eessi-bot[bot]

Updates by the bot instance boegel-bot-deucalion (click for details)
  • account casparvl has NO permission to send commands to the bot

Updates by the bot instance eessi-bot-mc-azure (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:x86_64/amd/zen3 from casparvl

    • expanded format: build repository:eessi.io-2023.06-software architecture:x86_64/amd/zen3
  • handling command build repository:eessi.io-2023.06-software architecture:x86_64/amd/zen3 resulted in:

    • no jobs were submitted

eessi-bot[bot] avatar Aug 22 '24 15:08 eessi-bot[bot]

New job on instance eessi-bot-mc-aws for architecture x86_64-amd-zen3 for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2024.08/pr_682/16829

date job status comment
Aug 22 15:04:13 UTC 2024 submitted job id 16829 awaits release by job manager
Aug 22 15:05:06 UTC 2024 released job awaits launch by Slurm scheduler
Aug 22 15:06:11 UTC 2024 finished
:cry: FAILURE (click triangle for details)
Details
:white_check_mark: job output file slurm-16829.out
:white_check_mark: no message matching ERROR:
:white_check_mark: no message matching FAILED:
:white_check_mark: no message matching required modules missing:
:white_check_mark: found message(s) matching No missing installations
:white_check_mark: found message matching .tar.gz created!
Artefacts
No artefacts were created or found.
Aug 22 15:06:11 UTC 2024 test result
:cry: FAILURE (click triangle for details)
Reason
EESSI test suite produced failures.
ReFrame Summary
[ FAILED ] Ran 18/18 test case(s) from 18 check(s) (18 failure(s), 0 skipped, 0 aborted)
Details
:white_check_mark: job output file slurm-16829.out
:x: found message matching ERROR:
:x: found message matching [\s*FAILED\s*].*Ran .* test case

eessi-bot[bot] avatar Aug 22 '24 15:08 eessi-bot[bot]

bot: build repo:eessi.io-2023.06-software arch:x86_64/amd/zen3

casparvl avatar Aug 22 '24 15:08 casparvl

Updates by the bot instance eessi-bot-mc-aws (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:x86_64/amd/zen3 from casparvl

    • expanded format: build repository:eessi.io-2023.06-software architecture:x86_64/amd/zen3
  • handling command build repository:eessi.io-2023.06-software architecture:x86_64/amd/zen3 resulted in:

    • submitted job 16830, for details & status see https://github.com/EESSI/software-layer/pull/682#issuecomment-2304926754

eessi-bot[bot] avatar Aug 22 '24 15:08 eessi-bot[bot]

Updates by the bot instance boegel-bot-deucalion (click for details)
  • account casparvl has NO permission to send commands to the bot

Updates by the bot instance eessi-bot-mc-azure (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:x86_64/amd/zen3 from casparvl

    • expanded format: build repository:eessi.io-2023.06-software architecture:x86_64/amd/zen3
  • handling command build repository:eessi.io-2023.06-software architecture:x86_64/amd/zen3 resulted in:

    • no jobs were submitted

eessi-bot[bot] avatar Aug 22 '24 15:08 eessi-bot[bot]

New job on instance eessi-bot-mc-aws for architecture x86_64-amd-zen3 for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2024.08/pr_682/16830

date job status comment
Aug 22 15:08:48 UTC 2024 submitted job id 16830 awaits release by job manager
Aug 22 15:09:17 UTC 2024 released job awaits launch by Slurm scheduler
Aug 22 15:10:21 UTC 2024 finished
:cry: FAILURE (click triangle for details)
Details
:white_check_mark: job output file slurm-16830.out
:white_check_mark: no message matching ERROR:
:white_check_mark: no message matching FAILED:
:white_check_mark: no message matching required modules missing:
:white_check_mark: found message(s) matching No missing installations
:white_check_mark: found message matching .tar.gz created!
Artefacts
No artefacts were created or found.
Aug 22 15:10:21 UTC 2024 test result
:cry: FAILURE (click triangle for details)
Reason
EESSI test suite produced failures.
ReFrame Summary
[ FAILED ] Ran 18/18 test case(s) from 18 check(s) (18 failure(s), 0 skipped, 0 aborted)
Details
:white_check_mark: job output file slurm-16830.out
:x: found message matching ERROR:
:x: found message matching [\s*FAILED\s*].*Ran .* test case

eessi-bot[bot] avatar Aug 22 '24 15:08 eessi-bot[bot]

bot: build repo:eessi.io-2023.06-software arch:x86_64/amd/zen3

casparvl avatar Aug 22 '24 15:08 casparvl

Updates by the bot instance eessi-bot-mc-aws (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:x86_64/amd/zen3 from casparvl

    • expanded format: build repository:eessi.io-2023.06-software architecture:x86_64/amd/zen3
  • handling command build repository:eessi.io-2023.06-software architecture:x86_64/amd/zen3 resulted in:

    • submitted job 16831, for details & status see https://github.com/EESSI/software-layer/pull/682#issuecomment-2304935053

eessi-bot[bot] avatar Aug 22 '24 15:08 eessi-bot[bot]

Updates by the bot instance eessi-bot-mc-azure (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:x86_64/amd/zen3 from casparvl

    • expanded format: build repository:eessi.io-2023.06-software architecture:x86_64/amd/zen3
  • handling command build repository:eessi.io-2023.06-software architecture:x86_64/amd/zen3 resulted in:

    • no jobs were submitted

eessi-bot[bot] avatar Aug 22 '24 15:08 eessi-bot[bot]

Updates by the bot instance boegel-bot-deucalion (click for details)
  • account casparvl has NO permission to send commands to the bot

New job on instance eessi-bot-mc-aws for architecture x86_64-amd-zen3 for repository eessi.io-2023.06-software in job dir /project/def-users/SHARED/jobs/2024.08/pr_682/16831

date job status comment
Aug 22 15:10:50 UTC 2024 submitted job id 16831 awaits release by job manager
Aug 22 15:11:24 UTC 2024 released job awaits launch by Slurm scheduler
Aug 22 15:12:30 UTC 2024 running job 16831 is running
Aug 22 15:27:18 UTC 2024 finished
:cry: FAILURE (click triangle for details)
Details
:white_check_mark: job output file slurm-16831.out
:white_check_mark: no message matching ERROR:
:white_check_mark: no message matching FAILED:
:white_check_mark: no message matching required modules missing:
:white_check_mark: found message(s) matching No missing installations
:white_check_mark: found message matching .tar.gz created!
Artefacts
No artefacts were created or found.
Aug 22 15:27:18 UTC 2024 test result
:grin: SUCCESS (click triangle for details)
ReFrame Summary
[ PASSED ] Ran 18/18 test case(s) from 18 check(s) (0 failure(s), 0 skipped, 0 aborted)
Details
:white_check_mark: job output file slurm-16831.out
:white_check_mark: no message matching ERROR:
:white_check_mark: no message matching [\s*FAILED\s*].*Ran .* test case

eessi-bot[bot] avatar Aug 22 '24 15:08 eessi-bot[bot]

bot: build repo:eessi.io-2023.06-software arch:x86_64/generic
bot: build repo:eessi.io-2023.06-software arch:x86_64/intel/haswell
bot: build repo:eessi.io-2023.06-software arch:x86_64/intel/skylake_avx512
bot: build repo:eessi.io-2023.06-software arch:x86_64/amd/zen2
bot: build repo:eessi.io-2023.06-software arch:x86_64/amd/zen3
bot: build repo:eessi.io-2023.06-software arch:x86_64/amd/zen4
bot: build repo:eessi.io-2023.06-software arch:aarch64/generic
bot: build repo:eessi.io-2023.06-software arch:aarch64/neoverse_n1
bot: build repo:eessi.io-2023.06-software arch:aarch64/neoverse_v1

casparvl avatar Aug 22 '24 15:08 casparvl

Updates by the bot instance eessi-bot-mc-aws (click for details)
  • received bot command build repo:eessi.io-2023.06-software arch:x86_64/generic from casparvl

    • expanded format: build repository:eessi.io-2023.06-software architecture:x86_64/generic
  • received bot command build repo:eessi.io-2023.06-software arch:x86_64/intel/haswell from casparvl

    • expanded format: build repository:eessi.io-2023.06-software architecture:x86_64/intel/haswell
  • received bot command build repo:eessi.io-2023.06-software arch:x86_64/intel/skylake_avx512 from casparvl

    • expanded format: build repository:eessi.io-2023.06-software architecture:x86_64/intel/skylake_avx512
  • received bot command build repo:eessi.io-2023.06-software arch:x86_64/amd/zen2 from casparvl

    • expanded format: build repository:eessi.io-2023.06-software architecture:x86_64/amd/zen2
  • received bot command build repo:eessi.io-2023.06-software arch:x86_64/amd/zen3 from casparvl

    • expanded format: build repository:eessi.io-2023.06-software architecture:x86_64/amd/zen3
  • received bot command build repo:eessi.io-2023.06-software arch:x86_64/amd/zen4 from casparvl

    • expanded format: build repository:eessi.io-2023.06-software architecture:x86_64/amd/zen4
  • received bot command build repo:eessi.io-2023.06-software arch:aarch64/generic from casparvl

    • expanded format: build repository:eessi.io-2023.06-software architecture:aarch64/generic
  • received bot command build repo:eessi.io-2023.06-software arch:aarch64/neoverse_n1 from casparvl

    • expanded format: build repository:eessi.io-2023.06-software architecture:aarch64/neoverse_n1
  • received bot command build repo:eessi.io-2023.06-software arch:aarch64/neoverse_v1 from casparvl

    • expanded format: build repository:eessi.io-2023.06-software architecture:aarch64/neoverse_v1
  • handling command build repository:eessi.io-2023.06-software architecture:x86_64/generic resulted in:

    • submitted job 16832, for details & status see https://github.com/EESSI/software-layer/pull/682#issuecomment-2304971218
  • handling command build repository:eessi.io-2023.06-software architecture:x86_64/intel/haswell resulted in:

    • submitted job 16833, for details & status see https://github.com/EESSI/software-layer/pull/682#issuecomment-2304971649
  • handling command build repository:eessi.io-2023.06-software architecture:x86_64/intel/skylake_avx512 resulted in:

    • submitted job 16834, for details & status see https://github.com/EESSI/software-layer/pull/682#issuecomment-2304972005
  • handling command build repository:eessi.io-2023.06-software architecture:x86_64/amd/zen2 resulted in:

    • submitted job 16835, for details & status see https://github.com/EESSI/software-layer/pull/682#issuecomment-2304972391
  • handling command build repository:eessi.io-2023.06-software architecture:x86_64/amd/zen3 resulted in:

    • submitted job 16836, for details & status see https://github.com/EESSI/software-layer/pull/682#issuecomment-2304972786
  • handling command build repository:eessi.io-2023.06-software architecture:x86_64/amd/zen4 resulted in:

    • no jobs were submitted
  • handling command build repository:eessi.io-2023.06-software architecture:aarch64/generic resulted in:

    • submitted job 16837, for details & status see https://github.com/EESSI/software-layer/pull/682#issuecomment-2304973254
  • handling command build repository:eessi.io-2023.06-software architecture:aarch64/neoverse_n1 resulted in:

    • submitted job 16838, for details & status see https://github.com/EESSI/software-layer/pull/682#issuecomment-2304973624
  • handling command build repository:eessi.io-2023.06-software architecture:aarch64/neoverse_v1 resulted in:

    • submitted job 16839, for details & status see https://github.com/EESSI/software-layer/pull/682#issuecomment-2304974018

eessi-bot[bot] avatar Aug 22 '24 15:08 eessi-bot[bot]