
Command Tutorials to do

Open vsoch opened this issue 2 years ago • 24 comments

  • [x] flux proxy: "I want to connect to a flux instance across clusters with ssh" #192
  • [ ] flux proxy: "send commands to a flux instance you've started" #200
  • [ ] flux start: "I want to start my own flux allocation/instance to launch jobs"
  • [x] flux mini submit: "I want to run or submit a job to a flux instance" #194
    • [x] flux mini run #202
    • [x] flux job attach #204
  • [ ] flux mini alloc
  • [ ] flux mini batch - command directives, standard i/o
  • [ ] flux jobs: -a, --format, --recursive, --filter, formats
  • [ ] flux resource list
  • [ ] flux queue list
  • [ ] flux queue drain / idle: "waiting for all your jobs to complete"
  • [ ] flux job cancel/cancelall/flux pkill - #210
  • [ ] flux job kill/killall
  • [ ] flux job pgrep/pkill
  • [ ] flux filemap

Please add to this list as you see fit!

vsoch avatar Feb 07 '23 01:02 vsoch

Side note on the "submitting jobs faster" one: since it crosses multiple commands and is maybe more about techniques, it may cross the threshold into a general "tutorial". I began a skeleton of a tutorial a while back; not sure if I should divide it up or not.

chu11 avatar Feb 07 '23 01:02 chu11

@chu11 I think that might fit well for more of an "advanced tutorial" - to submit jobs more efficiently I imagine the user is pretty good at Flux but wants to be able to optimize that. We have entire sections of the docs just for Jobs so maybe as another page there? https://flux-framework.readthedocs.io/en/latest/jobs/index.html

We can also talk about a total re-organization of any part of the site - I'm not wed to anything and want to make it all better!

vsoch avatar Feb 07 '23 01:02 vsoch

One thing I've had in the back of my mind is that flux mini bulksubmit is generally inspired by GNU parallel, which has some great tutorials online. So there is a set of users out there who aren't HPC users, but want to run things in parallel. Maybe a specific tutorial geared toward that class of user would be a nice intro. It could borrow from GNU parallel tutorials, but add on the benefits of having a resident resource manager handling all those jobs instead of a single process:

  • oh, you want to quickly list which jobs failed? Here you go (flux jobs -f failed)
  • oh, you want to look at the output of a failed job again? flux job attach
  • oh, you want to watch output as the jobs execute? flux mini bulksubmit --watch --progress
  • oh, you want to cancel something that hasn't run yet? flux jobs + flux job cancel or flux pkill
  • oh, you forgot a set of things and want to run them too? (just submit more work!)
  • oh, your workload needs more resources? All these examples work whether your Flux instance is running locally on your laptop with flux start -s 1 or on a cluster of 1000 nodes.
  • oh, you only have access to a cluster with a traditional resource manager? That's ok, you can start a large Flux instance as a job under many resource managers.

And so on...
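As a rough sketch of how such a tutorial could line the questions up against commands, the snippet below only builds the command lines named in the list above and never runs them. The `JOBID` placeholder, the helper names, and the `echo {} ::: a b` input form (borrowed from GNU parallel style) are illustrative assumptions, not taken from the Flux docs.

```python
import shlex

# Placeholder job ID for illustration only; real IDs would come from `flux jobs`.
EXAMPLE_JOBID = "JOBID"


def followup_commands(jobid=EXAMPLE_JOBID):
    """Map each question from the list to the command it mentions."""
    return {
        "list failed jobs": ["flux", "jobs", "-f", "failed"],
        "look at a job's output again": ["flux", "job", "attach", jobid],
        "cancel a job that has not run yet": ["flux", "job", "cancel", jobid],
        "start a local single-broker instance": ["flux", "start", "-s", "1"],
    }


def bulksubmit_watch(command, inputs):
    """Build a bulksubmit command line that watches output and shows progress.

    The `:::` separator before the inputs follows GNU parallel conventions
    and is an assumption of this sketch.
    """
    return ["flux", "mini", "bulksubmit", "--watch", "--progress",
            *command, ":::", *inputs]


# Print shell-quoted command lines so they can be reviewed before running.
for question, argv in followup_commands().items():
    print(f"{question}: {shlex.join(argv)}")
print(shlex.join(bulksubmit_watch(["echo", "{}"], ["a", "b"])))
```

Keeping the commands as argument lists makes it easy to pass them to `subprocess.run` later, once they have been checked against the installed Flux version.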

grondo avatar Feb 07 '23 02:02 grondo

oh, you want to do the Flux thing? Have no fear - Fluxman is here!

image

I made him for an upcoming (TBA) talk with stable diffusion! He's kind of wonky but that's also why he's great :)

vsoch avatar Feb 07 '23 03:02 vsoch

I'm getting a strong Clamps vibe: What you want I should use Flux? The thing I use every day? The thing that I'm named after? You're a freakin' genius!

grondo avatar Feb 07 '23 03:02 grondo

@grondo you've out-culture-referenced me and I have to bite - what is Clamps? I might have seen it but I don't remember the name!

vsoch avatar Feb 08 '23 04:02 vsoch

They're coming straight toward our proximity! https://www.youtube.com/watch?v=Km_1NMUHjfA

garlick avatar Feb 08 '23 05:02 garlick

Omg! How did I miss this one! 😂 yes, clamps! Ty 🙏

vsoch avatar Feb 08 '23 05:02 vsoch

This particular cartoon was also giving me Strong Bad vibes https://youtu.be/90X5NJleYJQ (in blue!)

vsoch avatar Feb 08 '23 06:02 vsoch

I can give flux mini submit/run: "I want to run or submit a job to a flux instance" a shot! We can probably leverage the example we have of submitting jobs through CLI and/or API in our flux-workflow-examples repository. It's been a while since this repository has been updated but IMHO it would be nice to insert some of that work that folks have done in the past into our official documentation.

cmoussa1 avatar Feb 08 '23 17:02 cmoussa1

Before we pull any examples from flux-workflow-examples, let's make sure they are up to date and/or correct. Many of those examples were unfortunately out of date already a few years ago.

grondo avatar Feb 08 '23 18:02 grondo

One thing that I noticed today is that we don't have any workflow examples for users running workflows on system-instance Flux clusters (corona, tioga, etc.). We might not want this in flux-docs, since there will be a lot of LC specific questions.

But, some questions that I think users will probably have (because I have had them this week, thanks @cmoussa1 for answering on Slack):

  • How do I charge a bank that is not my default bank? (--setattr=system.bank=charge_me)
  • Or, how do I change my default bank?
  • How do I view the banks I have access to on a given machine? (Slurm: sacctmgr list user whoami witha; Flux: flux account view-user whoami)
  • What all can I set with --setattr?

These are just things I couldn't find in our docs or by running --help.

wihobbs avatar Sep 12 '23 18:09 wihobbs

Also @vsoch love the Fluxman graphic :)

wihobbs avatar Sep 12 '23 18:09 wihobbs

Or, how do I change my default bank?

This can be done with flux account edit-user moussa1 --default-bank=my_new_default_bank

However, I think this command is reserved for administrators (I don't think we want users to be able to update their default bank to one they should not have access to on a system instance of Flux).
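For a cheat-sheet style entry, the two flux-accounting commands mentioned in this thread could be sketched as below. The helper names are hypothetical, the snippet only builds the command lines and does not run them, and `edit-user` is assumed to need administrator rights as noted above.

```python
import getpass


def view_user_cmd(username=None):
    """Command to list the banks a user belongs to (defaults to the current user)."""
    user = username or getpass.getuser()
    return ["flux", "account", "view-user", user]


def edit_default_bank_cmd(username, bank):
    """Command to change a user's default bank; likely administrator-only."""
    return ["flux", "account", "edit-user", username, f"--default-bank={bank}"]


print(" ".join(view_user_cmd("moussa1")))
print(" ".join(edit_default_bank_cmd("moussa1", "my_new_default_bank")))
```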

cmoussa1 avatar Sep 12 '23 18:09 cmoussa1

@wihobbs we could definitely add them here, if others think that is appropriate! We already have a few under https://flux-framework.readthedocs.io/en/latest/tutorials/lab/index.html.

vsoch avatar Sep 12 '23 18:09 vsoch

How do I charge a bank that is not my default bank? (--setattr=system.bank=charge_me)

system. is the default prefix for --setattr, so more simply this is just --setattr=bank=charge_me.
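A small sketch of that equivalence, written as a hypothetical helper that expands the short form into the long one. Treating keys that start with `user.` or `.` as already qualified is an assumption here and should be checked against the man page for the installed version.

```python
def setattr_option(key, value):
    """Return the fully qualified --setattr option for a key/value pair.

    Keys without a recognized prefix get "system." prepended, matching the
    default described above. The "user." and "." prefixes are assumed to be
    left untouched.
    """
    qualified_prefixes = ("system.", "user.", ".")
    full_key = key if key.startswith(qualified_prefixes) else "system." + key
    return f"--setattr={full_key}={value}"


# Both spellings name the same jobspec attribute.
short_form = setattr_option("bank", "charge_me")
long_form = setattr_option("system.bank", "charge_me")
print(short_form, long_form, short_form == long_form)
```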

grondo avatar Sep 12 '23 21:09 grondo

@wihobbs actually adding to the cheatsheet might be a good idea.

https://github.com/flux-framework/cheat-sheet

Although bank stuff is a tad lab specific so maybe needs a comment in there with a caveat. Perhaps, "for sites that use flux-accounting and use banks for charging time" or something.

chu11 avatar Sep 12 '23 21:09 chu11

@chu11 This is new to me...and I'm in love, this is great! Tagging @vsoch too, this is so nice and concise.

Maybe we want a cheat sheet for LC users...similar to what Ines puts in Ramblings or we put in the Staff documentation. I'm sure if we think about it for a bit, there are other things that only LC users would want to have at their fingertips. Heck, maybe there's even a use case for an "I'm at LC" switch on the cheat sheet that adds "how to charge a bank," MPI flags when we have those, etc. to the existing cheat sheet.

wihobbs avatar Sep 12 '23 23:09 wihobbs

I could open a separate issue for this...don't want to take away focus from the in-depth tutorials named on here which I think would also be very useful.

wihobbs avatar Sep 12 '23 23:09 wihobbs

A cheat sheet, you say? :rofl:

https://flux-framework.org/cheat-sheet/

vsoch avatar Sep 12 '23 23:09 vsoch

@wihobbs I think you and I think quite alike. Which is probably equally fantastic and dangerous. I mean after all, I ate flux bird in his radish form. I'm a monster. :japanese_ogre:

vsoch avatar Sep 12 '23 23:09 vsoch

And here is the repository - https://github.com/flux-framework/cheat-sheet/ - it's already designed so that data (from yaml) renders into the UI, automation via GitHub pages, so we could do a tweak to allow more than one cheat sheet (e.g., specific to a system) and then have links to all of them somewhere.

vsoch avatar Sep 12 '23 23:09 vsoch

I'm making a note that for NEWT documentation day (12/15) I plan to knock out one (maybe two) of these outstanding todos. Any suggestions on priority/where help is needed?

wihobbs avatar Dec 06 '23 18:12 wihobbs

A lot of these seem to be outdated. We should make a pass through this list before starting any work.

grondo avatar Dec 06 '23 18:12 grondo