Installation on lxplus fails

Open sjonsell opened this issue 1 year ago • 5 comments

Hi,

I am failing to install WarpX on lxplus while (rather mindlessly) following the instructions at: https://warpx.readthedocs.io/en/latest/install/hpc/lxplus.html

I found two things that I think need to be corrected. First, in the instructions:

cp $WORK/warpx/WarpX/Tools/machines/lxplus-cern/lxplus_warpx.profile.example $WORK/lxplus_warpx.profile
source $WORK/lxplus_warpx.profile

should be:

cp $WORK/warpx/Tools/machines/lxplus-cern/lxplus_warpx.profile.example $WORK/lxplus_warpx.profile
source $WORK/lxplus_warpx.profile

Second, inside lxplus_warpx.profile:

spack env create warpx-lxplus-cuda-py $WORK/WarpX/Tools/machines/lxplus-cern/spack.yaml

should be:

spack env create warpx-lxplus-cuda-py $WORK/warpx/Tools/machines/lxplus-cern/spack.yaml

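Since the two fixes above only differ in the directory spelling (warpx vs. WarpX), a small check before copying the profile can show which spelling exists in a given checkout. This is a minimal sketch; the function name `find_lxplus_dir` is hypothetical, and the candidate paths are simply the spellings quoted in this thread.

```shell
# Print the first lxplus-cern directory that exists under the given base
# directory. The candidates are the path spellings mentioned in this issue;
# the function name is made up for illustration.
find_lxplus_dir() {
    base="$1"
    for candidate in \
        "$base/warpx/Tools/machines/lxplus-cern" \
        "$base/warpx/WarpX/Tools/machines/lxplus-cern" \
        "$base/WarpX/Tools/machines/lxplus-cern"
    do
        if [ -d "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    echo "no lxplus-cern directory found under $base" >&2
    return 1
}

# Usage (assumes $WORK is set as in the instructions):
#   machine_dir=$(find_lxplus_dir "$WORK") &&
#       cp "$machine_dir/lxplus_warpx.profile.example" "$WORK/lxplus_warpx.profile"
```
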
Correcting these gets me a bit further, but then sourcing lxplus_warpx.profile fails with:

source lxplus_warpx.profile
==> Error: No such environment: 'warpx-lxplus-cuda-py'
/usr/bin/which: no nvcc in (/afs/cern.ch/work/j/jonsell/spack/bin:/cvmfs/sft.cern.ch/lcg/releases/gcc/11.2.0-ad950/x86_64-centos7/bin:/cvmfs/sft.cern.ch/lcg/releases/binutils/2.37-4177a/x86_64-centos7/bin:/afs/cern.ch/work/j/jonsell/spack/bin:/cvmfs/sft.cern.ch/lcg/releases/gcc/11.2.0-ad950/x86_64-centos7/bin:/cvmfs/sft.cern.ch/lcg/releases/binutils/2.37-4177a/x86_64-centos7/bin:/afs/cern.ch/work/j/jonsell/spack/bin:/cvmfs/sft.cern.ch/lcg/releases/gcc/11.2.0-ad950/x86_64-centos7/bin:/cvmfs/sft.cern.ch/lcg/releases/binutils/2.37-4177a/x86_64-centos7/bin:/afs/cern.ch/user/j/jonsell/scripts:/usr/sue/bin:/usr/share/Modules/bin:/usr/condabin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/opt/puppetlabs/bin)

Then I am stuck.
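
The error above reports two separate problems: a Spack environment that was never created and an nvcc that is not on the PATH. A minimal sketch for checking both before sourcing the profile again could look like this; the helper name `check_tool` is hypothetical, while `command -v` and `spack env list` are standard.

```shell
# Report whether a command is reachable from the current shell.
# check_tool is an illustrative helper, not part of WarpX or Spack.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1 -> $(command -v "$1")"
    else
        echo "missing: $1"
    fi
}

check_tool spack
check_tool nvcc

# If Spack is available, list the environments it knows about, so it is
# visible whether 'warpx-lxplus-cuda-py' was actually created.
if command -v spack >/dev/null 2>&1; then
    spack env list
fi
```
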

sjonsell avatar Aug 15 '24 14:08 sjonsell

Hi @sjonsell,

Thanks for reaching out, let's see how we can get you up to speed on LXPLUS.

@lgiacome did the initial documentation for that system with me and I do not have direct access to it, so we will rely on your help. I assume the system has been updated since we documented this, so we need to reflect that in the docs now. Most likely it is at least CentOS 8 with a newer GCC version now.

What does

cat /etc/centos-release

return on the system?

What is in

ls /cvmfs/sft.cern.ch/lcg/releases/gcc/

these days?

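The source files for the docs you linked are here (https://github.com/ECP-WarpX/WarpX/blob/development/Docs/source/install/hpc/lxplus.rst) and here (scripts: https://github.com/ECP-WarpX/WarpX/tree/development/Tools/machines/lxplus-cern).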

ax3l avatar Aug 19 '24 23:08 ax3l

Dear Axel,

thank you very much for looking into this, your help is much appreciated.

Unfortunately:

cat /etc/centos-release
cat: /etc/centos-release: No such file or directory

and I haven’t been able to find it anywhere else.

Then I tried:

cat /etc/os-release
NAME="Red Hat Enterprise Linux"
VERSION="9.4 (Plow)"
ID="rhel"
ID_LIKE="fedora"
VERSION_ID="9.4"
PLATFORM_ID="platform:el9"
PRETTY_NAME="Red Hat Enterprise Linux 9.4 (Plow)"
ANSI_COLOR="0;31"
LOGO="fedora-logo-icon"
CPE_NAME="cpe:/o:redhat:enterprise_linux:9::baseos"
HOME_URL="https://www.redhat.com/"
DOCUMENTATION_URL="https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/9"
BUG_REPORT_URL="https://bugzilla.redhat.com/"
REDHAT_BUGZILLA_PRODUCT="Red Hat Enterprise Linux 9"
REDHAT_BUGZILLA_PRODUCT_VERSION=9.4
REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="9.4"

I also looked at the lxplus documentation. According to it, there are a few machines with CentOS 8, which can be reached by connecting to lxplus8.cern.ch.

I did this and got a similar result, still no /etc/centos-release and:

cat /etc/os-release
NAME="Red Hat Enterprise Linux"
VERSION="8.10 (Ootpa)"
ID="rhel"
ID_LIKE="fedora"
VERSION_ID="8.10"
PLATFORM_ID="platform:el8"
PRETTY_NAME="Red Hat Enterprise Linux 8.10 (Ootpa)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:redhat:enterprise_linux:8::baseos"
HOME_URL="https://www.redhat.com/"
DOCUMENTATION_URL="https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8"
BUG_REPORT_URL="https://issues.redhat.com/"
REDHAT_BUGZILLA_PRODUCT="Red Hat Enterprise Linux 8"
REDHAT_BUGZILLA_PRODUCT_VERSION=8.10
REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="8.10"
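
Because /etc/centos-release is missing on both node types while /etc/os-release is present, a script that needs the OS version could read the latter instead. The following is a minimal sketch; the function name `print_os_id` is hypothetical, and it only uses the `ID` and `VERSION_ID` fields shown in the outputs above.

```shell
# Print "<ID> <VERSION_ID>" from an os-release style file.
# print_os_id is an illustrative name; the default path is the standard
# /etc/os-release location.
print_os_id() {
    file="${1:-/etc/os-release}"
    if [ ! -r "$file" ]; then
        echo "cannot read $file" >&2
        return 1
    fi
    # Source the file in a subshell so its variables do not leak
    # into the calling shell.
    ( . "$file" && printf '%s %s\n' "$ID" "$VERSION_ID" )
}

# Report the OS of the current machine, if the file exists.
print_os_id || true
```

For the first output above (RHEL 9.4), this prints `rhel 9.4`.
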

Then (on both):

ls /cvmfs/sft.cern.ch/lcg/releases/gcc/
10.1.0 11.2.0 12.1.0.multilib-b4b9d 4.9.2 7.3.0-90605 8.3.0.1-0a5ad 9.2.0-afc57
10.1.0-6f386 11.2.0-8a51a 13.1.0 4.9.3 7.3.0-cb1ee 8.3.0-cebb0 9.3.0
10.1.0.c82-6f386 11.2.0-ad950 13.1.0-b3d18 5.2.0 8.1.0 8.3.0-eda0e 9.3.0-467e1
10.2.0-c44b3 11.3.0 14.1.0-79c66 6.2.0 8.1.0-97bb5 8.3.1 9.3.0-6991b
10.3.0 11.3.0-ad0f5 14.2.0-2f0a0 6.2.0-2bc78 8.1.0-ffff6 9.1.0 9.3.0.c82-6991b
10.3.0-f5826 11.3.0-de683 14.2.0.fp-6ad5e 6.2.0-b9934 8.2.0 9.1.0.1
10.3.0.fp 11.3.1 4.8.1 6.3.0 8.2.0-3fa06 9.1.0.1-0b417
10.3.0.fp-95d4b 12.1.0 4.8.4 7.1.0 8.2.0-dedd0 9.1.0-f2757
11.1.0 12.1.0-2435c 4.8.5 7.2.0 8.3.0 9.2.0
11.1.0-e80bf 12.1.0-57c96 4.9.1 7.3.0 8.3.0.1 9.2.0-6bb1e
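
The plain `ls` output above is sorted lexically, which makes the newest release hard to spot. A minimal sketch for ordering such a directory by version number follows; the function name `newest_version_dir` is hypothetical, and it assumes GNU `sort` with the `-V` (version sort) option. How entries with suffixes such as `.fp` or a hash are ranked relative to the plain version is not verified here, so the result should be checked by eye.

```shell
# Print the entry of a directory that sorts last in version order.
# newest_version_dir is an illustrative name; sort -V is a GNU extension.
newest_version_dir() {
    ls -1 "$1" | sort -V | tail -n 1
}

# Usage on lxplus (path taken from this thread):
#   newest_version_dir /cvmfs/sft.cern.ch/lcg/releases/gcc/
```
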

Thanks again, Svante

sjonsell avatar Aug 20 '24 12:08 sjonsell

Hi @sjonsell and @ax3l! 😄

I wrote the installation folder on lxplus a few years ago and have not attempted it again in a while, so I don't have an immediate fix in mind, but I can look into it.

Back when I wrote this, spack was not as mature as it is now, so I was just using as much pre-installed software as possible from an LCG distribution, but apparently this disappeared when the nodes were upgraded to CentOS 8.

Lesson learned: do not rely on pre-installed software on lxplus as it can disappear.

I can try to write a new spack.yaml where I install everything from scratch so that we only rely on spack, and hopefully solve the issue.

I will keep you posted, but hopefully this won't take more than just a few days.

lgiacome avatar Aug 20 '24 15:08 lgiacome

Thank you Lorenzo,

I guess you can see my previous response to Axel.

As far as I understand, the problem is that nvcc no longer exists on lxplus, plus a couple of paths that use WarpX instead of warpx.

Best regards, Svante

sjonsell avatar Aug 21 '24 09:08 sjonsell

I see, so LXPLUS got a few system upgrades and now runs Red Hat Enterprise Linux 9 (RHEL 9), which explains why our docs are outdated.

Thanks for the help @lgiacome, let me know if I can be of any help (review, questions, etc.). For Spack, I would recommend using a tagged release now, e.g., v0.22.1 (not its develop branch).
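
Pinning Spack to a tagged release could be done roughly as follows. This is only a sketch: the tag is the one suggested above, the destination `$WORK/spack` is an assumption inferred from the PATH in the error message earlier in this thread, and the function name `clone_spack_release` is hypothetical.

```shell
# Assumptions: the tag is the one recommended above; the destination
# directory is a guess based on the PATH shown in the error message.
SPACK_TAG="v0.22.1"
SPACK_DIR="${WORK:-$HOME}/spack"

# Clone a single tagged Spack release instead of the develop branch.
# clone_spack_release is an illustrative helper name.
clone_spack_release() {
    # $1: release tag, $2: destination directory
    # --branch also accepts tags; --depth 1 fetches only that snapshot.
    git clone --depth 1 --branch "$1" https://github.com/spack/spack.git "$2"
}

# Usage (needs network access and a non-existing or empty destination):
#   clone_spack_release "$SPACK_TAG" "$SPACK_DIR"
#   . "$SPACK_DIR/share/spack/setup-env.sh"
```
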

ax3l avatar Aug 26 '24 17:08 ax3l