multiple 1click-hpc clusters with the same FSx will crash enginframe

Open rvencu opened this issue 3 years ago • 6 comments

Since the /fsx/nice location is not unique to a cluster, starting multiple clusters with the same FSx overwrites the portal data of the older clusters.

rvencu avatar Jun 29 '22 23:06 rvencu

Yes, right. We could consider using something like /fsx/nice/{instanceID}/ instead. What do you think?

nicolaven avatar Jul 07 '22 13:07 nicolaven

Sounds reasonable

rvencu avatar Jul 07 '22 14:07 rvencu

Or what about the cluster name, which has to be unique anyway?

"export CLUSTER_NAME=${AWS::StackName}"

rvencu avatar Jul 07 '22 21:07 rvencu

Yes, this is an option, because the module that installs EF backs up an existing EF installation. So you are fine deleting an old cluster and creating a new one with the same name that mounts the same FSx.

nicolaven avatar Jul 08 '22 07:07 nicolaven

In post.install.sh, line 62:

    export NICE_ROOT=$(jq --arg default "${SHARED_FS_DIR}/nice/${stack_name}" -r '.post_install.enginframe | if has("nice_root") then .nice_root else $default end' "${dna_json}")

And in 10.install.enginframe.headnode.sh, line 60:

    if [[ -d "${SHARED_FS_DIR}/nice/${stack_name}" ]]; then
        mv  -f "${SHARED_FS_DIR}/nice/${stack_name}" "${SHARED_FS_DIR}/nice/${stack_name}.$(date "+%d-%m-%Y-%H-%M").BAK"
    fi

Then multiple clusters can live side by side with their EnginFrame portals intact.
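
To make the effect of the two changes concrete, here is a minimal, self-contained sketch of the layout they would produce. It is not code from the repository: the helper names `nice_root_for` and `backup_existing_root`, the stack names `cluster-a` and `cluster-b`, and the temporary directory standing in for the FSx mount are all assumptions made for illustration.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper (not in the repo): build the per-cluster portal root,
# mirroring the "${SHARED_FS_DIR}/nice/${stack_name}" default proposed above.
nice_root_for() {
    local shared_fs_dir="$1" stack_name="$2"
    printf '%s/nice/%s\n' "${shared_fs_dir}" "${stack_name}"
}

# Hypothetical helper: move an existing portal directory aside before a fresh
# install, using the same timestamped .BAK suffix as the snippet above.
backup_existing_root() {
    local root="$1"
    if [[ -d "${root}" ]]; then
        mv -f "${root}" "${root}.$(date "+%d-%m-%Y-%H-%M").BAK"
    fi
}

# A temporary directory stands in for the shared FSx mount (assumption).
SHARED_FS_DIR="$(mktemp -d)"

# Two clusters with different stack names install onto the same file system.
for stack_name in cluster-a cluster-b; do
    root="$(nice_root_for "${SHARED_FS_DIR}" "${stack_name}")"
    backup_existing_root "${root}"
    mkdir -p "${root}"
    echo "portal data for ${stack_name}" > "${root}/marker.txt"
done

# cluster-a is deleted and re-created under the same name: its old portal
# directory is backed up, while cluster-b's directory is left untouched.
root="$(nice_root_for "${SHARED_FS_DIR}" "cluster-a")"
backup_existing_root "${root}"
mkdir -p "${root}"

# List the resulting per-cluster directories and the backup.
ls "${SHARED_FS_DIR}/nice"
```

Because the backup suffix only has minute resolution, two re-installs of the same stack within one minute would collide on the backup name; the sketch does not attempt to handle that case.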

rvencu avatar Jul 12 '22 13:07 rvencu

Yep! Can you send a PR specifically with this modification so I can incorporate it?

nicolaven avatar Jul 12 '22 13:07 nicolaven