
What is the purpose of var/scale_clusterdefinition.json?

Open troppens opened this issue 5 years ago • 9 comments

I stumbled over the file var/scale_clusterdefinition.json and I am wondering if this is a leftover that can be removed.

I successfully installed and configured a single node Spectrum Scale cluster by customizing the files hosts and group_vars/all. The content in var/scale_clusterdefinition.json describes a different cluster configuration that does not have any impact on my installation.

My suggestion would be to either remove this file or provide proper documentation of its purpose and usage.

My hosts (customized by me):

# hosts:
[cluster]
10.1.1.20  scale_cluster_quorum=true scale_cluster_manager=true scale_cluster_gui=false

My group_vars/all (customized by me):

# group_vars/all:
---
scale_storage:
  - filesystem: gpfs
    automaticMountOption: true
    disks:
      - device: /dev/sdb
        servers: 10.1.1.20

Content in var/scale_clusterdefinition.json (as in GitHub, unchanged):

{
  "node_details": [
    {
      "fqdn": "host-vm1",
      "ip_address": "192.168.100.101",
      "is_nsd_server": true,
      "is_quorum_node": true,
      "is_manager_node": true,
      "is_gui_server": false
    },
    {
      "fqdn": "host-vm2",
      "ip_address": "192.168.100.102",
      "is_nsd_server": true,
      "is_quorum_node": true,
      "is_manager_node": true,
      "is_gui_server": false
    }
  ],
  "scale_storage": [
    {
      "filesystem": "fs1",
      "blockSize": "4M",
      "defaultDataReplicas": 1,
      "defaultMountPoint": "/mnt/fs1",
      "disks": [
       {
        "device": "/dev/sdd",
        "nsd": "nsd1",
        "servers": "host-vm1"
       },
       {
        "device": "/dev/sdf",
        "nsd": "nsd2",
        "servers": "host-vm1,host-vm2"
       }
      ]
    }
  ]
}

troppens avatar Apr 06 '20 12:04 troppens

@troppens Thanks for your feedback. Currently this JSON file, i.e. scale_clusterdefinition.json, is mainly for AWS cloud deployment: it is generated automatically during cloud deployment through Terraform resource provisioning. However, the JSON inventory can be used for any supported Scale deployment, so we added it as a template in case someone is planning to use the JSON inventory. We are planning to add complete information about the JSON inventory method to the README.

The current playbook supports two different methods of defining the GPFS cluster node inventory, and users can use either of them:

  1. JSON inventory method, i.e. scale_clusterdefinition.json — execute the playbook directly through the JSON inventory: ansible-playbook cloud_playbook.yml
  2. hosts and group_vars inventory method — ./ansible.sh or ansible-playbook -i hosts playbook.yml

rajan-mis avatar Apr 06 '20 13:04 rajan-mis

@rajan-mis Thanks. I was also wondering about cloud_playbook. It should be covered in the README, too.

troppens avatar Apr 06 '20 13:04 troppens

@troppens Yes, this will also get updated. We are planning to merge cloud_playbook.yml into playbook.yml. Thanks

rajan-mis avatar Apr 06 '20 13:04 rajan-mis

@rajan-mis Sorry, I did not understand: merge into playbook.yml? I was under the impression of a structure like playbooks/cloud_playbook.yml, into which some other example playbooks could probably also be put.

sasikeda avatar Apr 06 '20 13:04 sasikeda

@sasikeda We are thinking of keeping only one main playbook, i.e. playbook.yml, that should work with both the JSON and the hosts/group_vars inventory methods. Thanks

rajan-mis avatar Apr 06 '20 13:04 rajan-mis

Throwing this out there: what about providing a docs/examples/ directory with various sample files, including a general playbook.yml? Then our instructions would be:

  • cd <repo>
  • Copy a sample playbook to the repo top directory: cp docs/examples/playbook.yml .
  • Modify playbook.yml to match your environment

What's the difference, you may ask? When users have cloned and deployed this repo, and they then make changes to the tracked file playbook.yml, they may or may not be able to rebase their clone with changes that we merge into master (i.e. git pull may complain about conflicts).

So what would naturally happen is that they would maintain their own set of copied playbook.yml files anyway (cp playbook.yml myplaybook.yml), so that their changes are untracked by git...

whowutwut avatar Apr 06 '20 15:04 whowutwut

In general, files that have to be customized should not be checked in. They should either go into a separate directory, as @whowutwut suggests, or carry a suffix, e.g. playbook.yml.smp. With the suffix they could be placed in the right directory, which would make it easier for Ansible newbies to put the respective files in the respective locations.

I would also suggest excluding such to-be-customized files in .gitignore, to prevent them from being checked in accidentally.
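A minimal sketch of such a .gitignore, assuming the convention that tracked samples carry a .smp suffix while the customized copies keep the plain names (the exact file names are illustrative):

```gitignore
# Ignore locally customized copies; only the tracked *.smp samples remain under version control.
playbook.yml
hosts
group_vars/all
```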

troppens avatar Apr 06 '20 18:04 troppens

Thinking further about my previous comment, I suggest considering how users consume the Ansible tooling provided by this project. I did a quick test and now suggest packaging this project slightly differently.

Let's say that I already have a playbook to manage my environment. Then I would like to include the tooling provided by this project into my already existing playbook instead of writing a new playbook.

Example project:

[root@origin mysite]# ls
group_vars  hosts  site.yml

Example inventory file:

[root@origin mysite]# cat hosts
# hosts:
[cluster]
10.1.1.20  scale_cluster_quorum=true scale_cluster_manager=true scale_cluster_gui=false

Example variable definition:

[root@origin mysite]# cat group_vars/all
# group_vars/all:
---
scale_storage:
  - filesystem: gpfs
    automaticMountOption: true
    disks:
      - device: /dev/sdb
        servers: 10.1.1.20

Example playbook:

[root@origin mysite]# cat site.yml
---
- hosts: all
  roles:
    - setup_ssh

- hosts: cluster
  vars:
    - scale_version: 5.0.4.1
    - scale_install_localpkg_path: /software/Spectrum_Scale_Developer-5.0.4.1-x86_64-Linux-install
  roles:
     - spectrum_scale_core/precheck
     - spectrum_scale_core/node
     - spectrum_scale_core/cluster

The example above illustrates how the tooling could be integrated and combined with other roles.

To get this running I did the following:

[root@origin ~]# cd /etc/ansible/roles/

[root@origin roles]# ls
setup_ssh

[root@origin roles]# git clone https://github.com/IBM/ibm-spectrum-scale-install-infra.git /etc/ansible/roles/spectrum-scale
Cloning into '/etc/ansible/roles/spectrum-scale'...
remote: Enumerating objects: 40, done.
remote: Counting objects: 100% (40/40), done.
remote: Compressing objects: 100% (34/34), done.
remote: Total 458 (delta 15), reused 18 (delta 6), pack-reused 418
Receiving objects: 100% (458/458), 107.64 KiB | 577.00 KiB/s, done.
Resolving deltas: 100% (187/187), done.

[root@origin roles]# ln -s spectrum-scale/roles/core/ spectrum_scale_core

[root@origin roles]# ls
setup_ssh  spectrum-scale  spectrum_scale_core

[root@origin roles]#

The directory /etc/ansible/roles now includes two roles:

  • setup_ssh
  • spectrum_scale_core

I have referenced both roles in the playbook site.yml above.

Now I can run my playbook to configure ssh and Spectrum Scale:

[root@origin roles]# cd ~/mysite/

[root@origin mysite]# ansible-playbook -i hosts site.yml

PLAY [all] *************************************************************************************************************

TASK [Gathering Facts] *************************************************************************************************
ok: [10.1.1.20]

TASK [setup_ssh : Disable ssh strict host key checking] ****************************************************************
ok: [10.1.1.20]

PLAY [cluster] *********************************************************************************************************

TASK [Gathering Facts] *************************************************************************************************
ok: [10.1.1.20]

TASK [common : common | Find available scale version] ******************************************************************
skipping: [10.1.1.20]

...

Based on that, I would suggest putting each role into a separate GitHub repo. Then users could clone the roles they want to use into a directory that is in the Ansible role search path.

The separation of roles would also enable the next integration step: making all roles available via Ansible Galaxy. IBM already has an account, and it would be nice to see roles for Spectrum Scale there soon.

troppens avatar Apr 07 '20 07:04 troppens

There's a lot of valuable information in this thread already. Let me add my comments:

  • The discussion has somewhat drifted from the original question. I'd suggest either changing the subject of this issue or closing it and opening a new one instead. To me, the main topic appears to be "How do users consume this project?".

  • I agree that users should not be instructed to modify files which are under version control. This would, for example, prevent updates through git pull.

  • While it's nice to provide sample playbook(s), I don't think that this is how the majority of users will actually consume this project. The purpose of sample inventory and playbook files is to complement the documentation. I guess that users will fall into one of these two categories:

    • Users without Ansible skills will rely on the CLI provided by the Installation Toolkit. They might not even care about the inner workings of the Toolkit, as long as it gets the job done...
    • Users with Ansible skills will most likely have existing inventory / playbooks / roles already to manage their infrastructure. They'll want to integrate the roles provided by this project into their existing setup, which might be really large and complex already. I doubt that users will modify the samples and add (read: duplicate) all their existing logic for adding Spectrum Scale to their automation workflow...
  • @troppens While creating symbolic links actually works as you describe above, my experience is that users will prefer using the ANSIBLE_ROLES_PATH configuration option to make Ansible search additional directories for roles. I'd think that most users will simply add /path/to/ibm-spectrum-scale-install-infra/roles/ to this path and start adding the roles to their existing playbooks.
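For illustration, the roles_path approach could look roughly like this in ansible.cfg (the clone location used here is an assumption, not a path the project prescribes):

```ini
# ansible.cfg — make Ansible search the cloned repo for roles,
# in addition to the default system-wide location.
# /root/ibm-spectrum-scale-install-infra is just an example clone path.
[defaults]
roles_path = /root/ibm-spectrum-scale-install-infra/roles:/etc/ansible/roles
```

Alternatively, the same effect can be achieved per-invocation by setting the ANSIBLE_ROLES_PATH environment variable before running ansible-playbook.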

  • My experience is that users prefer consuming Ansible content from Galaxy — users prefer ansible-galaxy install over git clone. There's a discussion and some information on Galaxy support in #172.

  • I would advise against splitting this project up into multiple repos, as that would make it more complex to import the required parts into an existing setup. Ansible Content Collections allow publishing multiple roles (along with playbooks, modules, and plugins) within a single "container". There are good examples of IBM content collections available.
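As a sketch of how the collections approach might look from a user's playbook, assuming a hypothetical namespace, collection, and role names (none of these are the project's actual identifiers):

```yaml
# site.yml — consuming roles from a single content collection
# installed via: ansible-galaxy collection install ibm.spectrum_scale
# (ibm.spectrum_scale and the role names below are illustrative only)
- hosts: cluster
  collections:
    - ibm.spectrum_scale
  roles:
    - core_precheck
    - core_node
    - core_cluster
```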

acch avatar Aug 05 '20 08:08 acch