Restructure pipeline configs to improve discoverability

Open edmundmiller opened this issue 9 months ago • 8 comments

I think restructuring the configs across the board like this would make it easier for new maintainers to find the pipeline configs for their institutions.

edmundmiller avatar Mar 05 '25 02:03 edmundmiller

I support this! At least for me, it is easy to see now.

lpantano avatar Mar 05 '25 02:03 lpantano

This might be a good Hackathon task, thinking about it (just moving stuff around), but we would need to coordinate with @mashehu, as we will need to update how the website parses this repo.

jfy133 avatar Mar 05 '25 05:03 jfy133

Yep, please don't move anything without updating the website at the same time. My hope is that surfacing them on the website with the latest website upgrade should already solve a lot of the discoverability issues.

mashehu avatar Mar 05 '25 09:03 mashehu

Can I get an overview of what you're doing that improves discoverability?

Each config and related material (docs, pipeline) is put in a folder with the profile name? I think that's better. It'll allow better dynamic inclusion too.

I think the README should be right next to the config.

What other discoverability issues are there?

  • Key words? Hostnames?
  • What the config supports in plain language?

mahesh-panchal avatar Jul 03 '25 07:07 mahesh-panchal

Hey @mahesh-panchal, sorry for missing this.

I think it's also around the pipeline-specific configs and the discoverability of that.

I think a lot of people look for the examples, and then they find them hidden under layers and layers of directories and spread all over the place.

The way this ended up structured is that we're organizing based on the "types" of things. We're matching "this is a doc," "this is a pipeline config," and "this is an HPC system config."

Versus "Here's all of this specific HPC system's docs and configs in one directory".

edmundmiller avatar Aug 29 '25 12:08 edmundmiller

Current Structure Analysis

  • conf/uppmax/README.md ✅ (already exists)
  • conf/uppmax/uppmax.config (main config)
  • conf/uppmax/pipeline/ampliseq.config (ampliseq-specific config)
  • conf/uppmax/pipeline/ampliseq.md (ampliseq-specific docs)
  • conf/pipeline/sarek/uppmax.config (sarek config in old location)
  • conf/pipeline/ampliseq/uppmax.config (duplicate of ampliseq config)
  • docs/pipeline/sarek/uppmax.md (sarek docs in old location)
  • docs/pipeline/ampliseq/uppmax.md (duplicate of ampliseq docs)

Target Structure

conf/uppmax/
├── nextflow.config                    # Renamed from uppmax.config
├── README.md                         # Keep existing
└── pipelines/                        # Rename from 'pipeline'
    ├── ampliseq/
    │   ├── nextflow.config           # Rename from ampliseq.config
    │   └── README.md                 # Rename from ampliseq.md
    └── sarek/
        ├── nextflow.config           # Move from conf/pipeline/sarek/uppmax.config
        └── README.md                 # Move from docs/pipeline/sarek/uppmax.md

Actions Required

  1. Rename main config: conf/uppmax/uppmax.config → conf/uppmax/nextflow.config
  2. Rename pipeline directory: conf/uppmax/pipeline/ → conf/uppmax/pipelines/
  3. Create sarek subdirectory: conf/uppmax/pipelines/sarek/
  4. Move sarek files:
      • conf/pipeline/sarek/uppmax.config → conf/uppmax/pipelines/sarek/nextflow.config
      • docs/pipeline/sarek/uppmax.md → conf/uppmax/pipelines/sarek/README.md
  5. Rename ampliseq files:
      • conf/uppmax/pipelines/ampliseq.config → conf/uppmax/pipelines/ampliseq/nextflow.config
      • conf/uppmax/pipelines/ampliseq.md → conf/uppmax/pipelines/ampliseq/README.md
  6. Remove duplicates:
      • Delete conf/pipeline/ampliseq/uppmax.config
      • Delete docs/pipeline/ampliseq/uppmax.md

This creates a consistent structure where all uppmax files are co-located under conf/uppmax/ with uniform naming (nextflow.config, README.md).
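The actions above could be sketched for a single profile roughly as follows. This is a minimal illustration, not the script used for this repo: the function names `plan_profile_moves` and `apply_plan` are hypothetical, the paths are taken only from the uppmax example, and files that map to an already-claimed destination are assumed to be identical duplicates (their contents are not compared).

```python
from pathlib import Path
import shutil

# Old suffix -> uniform file name used in the target structure.
UNIFORM_NAMES = {".config": "nextflow.config", ".md": "README.md"}


def plan_profile_moves(repo: Path, profile: str):
    """Plan the restructure for one profile without touching any files.

    Returns (moves, duplicates): moves is a list of (source, destination)
    pairs; duplicates lists sources whose destination is already claimed.
    """
    target = repo / "conf" / profile
    candidates = [(target / f"{profile}.config", target / "nextflow.config")]

    # Files already co-located under conf/<profile>/pipeline/.
    old_dir = target / "pipeline"
    if old_dir.is_dir():
        for f in sorted(old_dir.iterdir()):
            if f.suffix in UNIFORM_NAMES:
                dst = target / "pipelines" / f.stem / UNIFORM_NAMES[f.suffix]
                candidates.append((f, dst))

    # Files in the type-based locations conf/pipeline/ and docs/pipeline/.
    for base, suffix in (("conf", ".config"), ("docs", ".md")):
        for f in sorted((repo / base / "pipeline").glob(f"*/{profile}{suffix}")):
            dst = target / "pipelines" / f.parent.name / UNIFORM_NAMES[suffix]
            candidates.append((f, dst))

    moves, duplicates, claimed = [], [], set()
    for src, dst in candidates:
        if not src.is_file():
            continue  # skip anything that does not exist for this profile
        if dst in claimed:
            duplicates.append(src)  # assumed identical; not verified here
        else:
            claimed.add(dst)
            moves.append((src, dst))
    return moves, duplicates


def apply_plan(moves, duplicates):
    """Execute a plan: move files, delete duplicates, drop emptied directories."""
    for src, dst in moves:
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(src), str(dst))
    for dup in duplicates:
        dup.unlink()
    for parent in {s.parent for s, _ in moves} | {d.parent for d in duplicates}:
        if parent.is_dir() and not any(parent.iterdir()):
            parent.rmdir()
```

Calling `plan_profile_moves(Path("."), "uppmax")` first and printing the result gives a dry run to review before `apply_plan` changes anything. In a real checkout, `git mv` would probably be preferable to `shutil.move` so that file history is preserved.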

edmundmiller avatar Aug 29 '25 13:08 edmundmiller

I don't think it's just discoverability that would benefit from the new structure that @edmundmiller proposes, but also creation and maintenance, which the website would not help with.

Alongside the examples that @edmundmiller mentioned: with everything self-contained in a single directory, it's easier to cross-reference your 'own' configs related to the same infra with fewer clicks, and to modify them in the same way (e.g. imagine doing a global search/replace in an IDE across all configs: the restriction pattern will be a single directory vs a more complex pattern with the current system).
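To make the search-scope point concrete, here is a rough sketch of the file patterns needed to collect everything belonging to one profile under each layout. The pattern lists are assumptions based only on the uppmax paths listed earlier in this thread, and `profile_files` is a hypothetical helper name.

```python
from pathlib import Path

# Type-based layout: a profile's files are spread over several locations,
# so one pattern is needed per location.
CURRENT_PATTERNS = [
    "conf/{p}/**/*",
    "conf/pipeline/*/{p}.config",
    "docs/pipeline/*/{p}.md",
]

# Proposed layout: everything for a profile sits under one directory.
TARGET_PATTERNS = ["conf/{p}/**/*"]


def profile_files(repo: Path, profile: str, patterns: list[str]) -> list[Path]:
    """Collect all files matching any of the patterns for the given profile."""
    found = set()
    for pattern in patterns:
        found.update(f for f in repo.glob(pattern.format(p=profile)) if f.is_file())
    return sorted(found)
```

With the proposed layout, the single `conf/<profile>/` pattern is also what you would type into an IDE's "files to include" box.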

It also would make initialisation of templates for configs etc much simpler in the future - something that the config builder that @mirpedrol has taken over would potentially benefit from.

So /approve (if that would work here... Maybe this could be an RFC @edmundmiller )

Also I guess you could probably write a little script to update the structure across all the configs @edmundmiller ?

jfy133 avatar Aug 30 '25 07:08 jfy133

Edit: oh you did that already 🤣

jfy133 avatar Aug 30 '25 07:08 jfy133