oci-hpc
Terraform examples for deploying HPC clusters on OCI
Hi team, I am provisioning an HPC cluster using a custom image, and it is not working. I cannot connect to the HPC BM nodes, and /etc/hosts on the bastion server is not updated.
When editing a stack and applying it, the process fails, because the instance pools are blocking an update: ``` Error: 409-Conflict, The Instance Configuration ocid1.instanceconfiguration.oc1.phx.aaaaaaaabycbnzxq4uskt4f7mklp4g4fcqk4m42aabj2r2fkchjygppdudua is associated to one or...
This setting: [image] leads to this error: ``` null_resource.cluster (remote-exec): TASK [safe_yum : yum first try] ************************************************ null_resource.cluster (remote-exec): ok: [inst-inls6-rich-maggot] null_resource.cluster (remote-exec): ok: [rich-maggot-bastion] null_resource.cluster (remote-exec): TASK [nfs-client :...
Hello, we have a problem with stack creation, and the Oracle engineering team suggested that we reach out to the team supporting this Terraform stack (https://github.com/oracle-quickstart/oci-hpc). The logs are: 2022/07/05 10:25:07[TERRAFORM_CONSOLE]...
Many errors occur during provisioning, as shown below. -- case-1) oci_core_cluster_network.cluster_network[0]: Still creating... [1h5m20s elapsed] oci_core_cluster_network.cluster_network[0]: Still creating... [1h5m30s elapsed] oci_core_cluster_network.cluster_network[0]: Still creating... [1h5m40s elapsed] oci_core_cluster_network.cluster_network[0]: Still creating... [1h5m50s...
`conf/variables.tpl` contains ``` variable "marketplace_version_id" { type = map(string) default = { "HPC_OL7" = "OracleLinux-7-OCA-RHCK-OFED-23.10-2.1.3.1-2024.03.15-0" "HPC_OL8" = "OracleLinux-8-OCA-RHCK-OFED-23.10-2.1.3.1-2024.03.15-0" "GPU_OL7_CUDA12.2" = "OracleLinux-7-OCA-RHCK-OFED-23.10-2.1.3.1-GPU-535-CUDA-12.2-2024.03.15-0" "GPU_OL8_CUDA12.2" = "OracleLinux-8-OCA-RHCK-OFED-23.10-2.1.3.1-GPU-535-CUDA-12.2-2024.03.15-0" "GPU_OL7_CUDA12.4" = "OracleLinux-7-OCA-RHCK-OFED-23.10-2.1.3.1-GPU-535-CUDA-12.4-2024.03.15-0" "GPU_OL8_CUDA12.4" = "OracleLinux-8-OCA-RHCK-OFED-23.10-2.1.3.1-GPU-535-CUDA-12.4-2024.03.15-0"...
The LDAP install failed in a recent stack deployment because the package is downloaded from CentOS 8, which is now EOL/EOS. We decided to download the legacy file and place it...
There is a legacy tuned profile for RHEL 7 that is not needed. This was confirmed in the Slack channel.