
add "share" command (originally: NFS share the ephemeral storage by default)

Open delagoya opened this issue 14 years ago • 11 comments

Would be nice to NFS share the ephemeral drive of the master host by default

delagoya avatar Aug 04 '11 20:08 delagoya

I'm thinking of doing a more general solution which is a new command that would NFS-share a given directory on the master across all nodes in the cluster. Something like:

$ starcluster share mycluster /mnt

We could also add an option to change the name of the mount point on the nodes:

$ starcluster share mycluster /mnt -m /master-ephemeral

There could also be an option to the cluster templates that specifies a list of directories to NFS-share in the config on startup. I think this would be more flexible and would fit a lot of different use-cases without needing to bake them all into the defaults. Does that sound reasonable?
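For example, a cluster template could grow a setting along these lines (hypothetical name and syntax, just to illustrate the idea):

[cluster smallcluster]
...
# directories on the master to NFS-share to all nodes (hypothetical option)
NFS_SHARES = /mnt, /scratch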

jtriley avatar Aug 05 '11 04:08 jtriley

sounds reasonable to me. I like the idea of configurable NFS shares. Would also need to tie that into the EBS volume definitions to make sure that NFS mount directives don't clash with EBS volume directives...

delagoya avatar Aug 05 '11 12:08 delagoya

Is this being worked on? I have scripts where a fair amount of temporary data gets shared globally, so the faster the NFS-shared storage runs, the faster everything goes... I would definitely use it. According to this:

http://bioteam.net/2010/07/boot-ephemeral-ebs-storage-performance-on-amazon-cc1-4xlarge-instance-types/

The fastest storage is ephemeral storage striped into RAID0. That sounds like the best way to create NFS-shared scratch space.
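For reference, a minimal sketch of that kind of striping with mdadm (untested, and the device names are placeholders that vary by instance type):

# Hypothetical sketch: stripe two ephemeral drives into a RAID0 array and
# mount the result as scratch space that could then be NFS-shared.
# /dev/xvdb and /dev/xvdc stand in for the instance's ephemeral devices.
mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/xvdb /dev/xvdc
mkfs.ext4 /dev/md0
mkdir -p /scratch
mount /dev/md0 /scratch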

cdoersch avatar Feb 26 '13 17:02 cdoersch

I have a plugin doing this. It doesn't make optimal use of the ephemeral storage (since it ignores the ephemeral drives on the non-master nodes) but it does the trick for now. I prefer @jtriley's suggestion as a long term solution. Using glusterfs or something similar to use the ephemeral storage of all nodes would be awesome.

import starcluster.clustersetup as clustersetup
from starcluster.logger import log


class NFSSharePlugin(clustersetup.DefaultClusterSetup):

    def __init__(self, disable_threads=False, num_threads=10):
        super(NFSSharePlugin, self).__init__(disable_threads=disable_threads,
                                             num_threads=num_threads)

    def run(self, nodes, master, user, user_shell, volumes):
        try:
            self._nodes = nodes
            self._master = master
            self._user = user
            self._user_shell = user_shell
            self._volumes = volumes

            # !!! NFS ephemeral processing directory (path hardcoded for now)
            self._setup_nfs(self.nodes, export_paths=['/scratch/username'],
                            start_server=False)
            # !!!
        finally:
            self.pool.shutdown()

    def _unmount_shares(self, node):
        cmd = "umount -fl /scratch/username"
        node.ssh.execute(cmd)

    def on_add_node(self, node, nodes, master, user, user_shell, volumes):
        self._nodes = nodes
        self._master = master
        self._user = user
        self._user_shell = user_shell
        self._volumes = volumes

        # !!! NFS ephemeral processing directory
        self._setup_nfs(nodes=[node], export_paths=['/scratch/username'],
                        start_server=False)
        # !!!

    def on_remove_node(self, node, nodes, master, user, user_shell, volumes):
        self._nodes = nodes
        self._master = master
        self._user = user
        self._user_shell = user_shell
        self._volumes = volumes

        # TODO: It seems we don't want to do this here since it will be
        # handled by the built-in NFS plugin (???)
        log.info("Removing node %s (%s)..." % (node.alias, node.id))

        # !!! NFS ephemeral processing directory
        self._remove_nfs_exports(node)

scrappythekangaroo avatar Apr 24 '13 13:04 scrappythekangaroo

The code scrappythekangaroo provided above does not work as is - there are several missing definitions, so a user cannot just make a plugin from it without effort. It explains the idea to a programmer, yet it is not suitable for all users. If you choose to release a plugin file that works as is, I will be happy to test it.

The long term solution Justin suggested may help take advantage of the larger ephemeral storage provided with larger instance types, which is otherwise unused. If a system does not require persistent storage, it may simplify the configuration for users who need large disk space and want to use the ephemeral storage of the master.

Also, is it possible to pool the ephemeral storage of the nodes together as well? This is hypothetical - yet it may have applications. If you are paying for machines with large storage anyway, why not maximize the use of that storage?

I hope the share command solution will be considered for the next release - I will be happy to test it.

         Jacob

Jacob-Barhak avatar Jul 29 '13 20:07 Jacob-Barhak

Hi @Jacob-Barhak. What are the missing definitions? I'm glad to provide any missing code.

The above code was not tested, but rather cut from another plugin (which has been tested) that does a bunch of other stuff in addition to sharing NFS on scratch space, so maybe I missed something.

scrappythekangaroo avatar Jul 30 '13 02:07 scrappythekangaroo

Thanks @scrappythekangaroo

So there are several issues here:

  1. The user needs to import the proper libraries. I assume the line is: import starcluster.clustersetup as clustersetup
  2. ShareNFSPlugin is not defined and this line may need explanation
  3. Your code shares a hardcoded path - it would be nice to let the user decide on the path without touching the code. It should be easy by passing a parameter from the configuration file (see the sketch after this list). This is nice to have - not critical.
  4. Minor and not critical, yet important for usability - your code is free text rather than a Python file with a name someone can download
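To illustrate point 3, here is a minimal sketch of accepting the path from the config file (untested; it relies on StarCluster passing options from the [plugin] section to the plugin constructor as lowercase keyword arguments):

import starcluster.clustersetup as clustersetup

class NFSSharePlugin(clustersetup.DefaultClusterSetup):

    # a PATH_TO_SHARE = /some/dir line in the [plugin] section arrives
    # here as the path_to_share keyword argument
    def __init__(self, path_to_share='/scratch'):
        super(NFSSharePlugin, self).__init__()
        self.path_to_share = path_to_share

    def run(self, nodes, master, user, user_shell, volumes):
        try:
            self._nodes = nodes
            self._master = master
            # export the configured path instead of a hardcoded one
            self._setup_nfs(self.nodes, export_paths=[self.path_to_share],
                            start_server=False)
        finally:
            self.pool.shutdown()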

If those elements are in the plugin then the user has simpler steps to follow:

  1. Download the plugin python script, say SharePath_plugin.py, to the .starcluster directory
  2. Change the configuration file to include the plugin by adding the line: PLUGINS = SharePath_plugin
  3. Add the following lines to the configuration file:

     [plugin SharePath_plugin]
     SETUP_CLASS = SharePath_plugin.NFSSharePlugin
     PATH_TO_SHARE = /SharedPathOnMaster

A solution like this will allow the user to avoid touching real code and only change a few lines in the configuration file. This is about the general desired plugin solution until the ideal solution of a starcluster share command is implemented.

To give you some background - I myself am interested in this plugin since I am running out of disk space when running simulations on the cloud. I am using c1.xlarge instances with 4x420GB of ephemeral storage, yet I can use less than the 10GB available on /home. If I could share /mnt/vol0 on the master with the other nodes, I would have sufficient disk space to run my larger simulations without choking.

There are several solutions I am exploring - yours is one of them. You can find additional information on my specific issues in: http://star.mit.edu/cluster/mlarchives/1795.html http://star.mit.edu/cluster/mlarchives/1803.html

Rayson was of great help in figuring out what was going on - since the disk issue was hidden - and I hope you can help reach an elegant way to fix it.

              Jacob

Jacob-Barhak avatar Jul 30 '13 03:07 Jacob-Barhak

Hi @Jacob-Barhak, this has taken me a while to get around to, so maybe you already have a solution, but the plugin is now available here: scrappythekangaroo/StarClusterPlugins@775500618279f2fd83e2b7365b5c86cf07e11975

I implemented the plugin according to @jtriley's description of the share command:

  • mount any NFS path on master (not just scratch)
  • specify arbitrary mount point
  • optionally specify export and mount settings (otherwise falls back to StarCluster's default settings)

@jtriley: I think it would be nice if all starcluster volumes could have the option of specifying export/mount options via the config file.

scrappythekangaroo avatar Oct 23 '13 14:10 scrappythekangaroo

Thanks @scrappythekangaroo

This works now.

Here are the tests I used with starcluster 0.94.2 on Windows 7 and PythonXY:

  1. Downloaded your repository and copied the file nfsshare.py to the .starcluster\plugins directory
  2. Defined the plugin in the config file:

     PLUGINS = anaconda_plugin, nfsshare
     ...
     [plugin nfsshare]
     SETUP_CLASS = nfsshare.NFSSharePlugin
     SERVER_PATH = /mnt/vol0
     CLIENT_PATH = /mnt/vol0
     EXPORT_NFS_SETTINGS = sync,no_root_squash,no_subtree_check,rw
     MOUNT_NFS_SETTINGS = vers=3,user,rw,exec,noauto

  3. Started a 5 node cluster
  4. Uploaded MIST to the new shared directory on the master
  5. Made sure that this directory is seen from a node using: starcluster sshnode mycluster node001 "ls /mnt/vol0/MIST"
  6. Ran the MIST cluster test suite installed in that directory, which uses files there, and made sure all works well
  7. Added a node
  8. Made sure results are seen from the new node using sshmaster and ls
  9. Removed the new node

The entire process worked well.

Thanks for fixing this. There was demand for this feature on the mailing list - I will send a link there to let everybody know that a solution was found.

Many thanks and I hope others find this useful

Jacob-Barhak avatar Oct 24 '13 22:10 Jacob-Barhak

Great, thanks for testing @Jacob-Barhak

scrappythekangaroo avatar Oct 25 '13 14:10 scrappythekangaroo

Here's what I've come up with to pool the ephemeral storage on a node, regardless of instance type. Doing the following as a plugin that runs on each node takes WAY too long... it is better to do it as listed below to speed up node creation:

  1. Set the USERDATA_SCRIPTS = ~/.starcluster/plugins/config_node.sh

Then, in ~/.starcluster/plugins/config_node.sh:

VOLUMES="" for device in curl -s 169.254.169.254/latest/meta-data/block-device-mapping/ do if [[ $device == "ephemeral"* ]] then block=curl -s 169.254.169.254/latest/meta-data/block-device-mapping/$device | awk -F/ '{print $NF}'; if [[ -e /dev/$block ]] then pvcreate /dev/$block VOLUMES="${VOLUMES} /dev/$block" fi fi done

vgcreate vg_ephemeral $VOLUMES SIZE=vgdisplay vg_ephemeral | grep "Total PE" | awk '{print $3}' lvcreate -l $SIZE vg_ephemeral -n ephemerallv mkfs.ext3 /dev/mapper/vg_ephemeral-ephemerallv mkdir /scratch mount /dev/mapper/vg_ephemeral-ephemerallv /scratch chmod 1777 /scratch

The only problem I have with this is that if I reboot the node, the storage goes away. I need to add the volume information to /etc/fstab, which I haven't gotten around to yet.
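A minimal sketch of that fstab entry might look like this (untested; nofail keeps the boot from hanging if the devices are missing):

# Hypothetical sketch: persist the mount across reboots. LVM metadata lives
# on the drives themselves, so the volume group is re-detected at boot.
echo "/dev/mapper/vg_ephemeral-ephemerallv /scratch ext3 defaults,nofail 0 0" >> /etc/fstab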

golharam avatar Oct 15 '15 14:10 golharam