peppy
Project config file keyword suggestions
This idea came up when I was writing a configuration file that uses the subprojects section. For each of my subprojects, I was defining an alternate output_dir and sample_annotation, but I was nesting these directly under the subproject name itself rather than within a metadata subsection.
This led to what seemed like a failure by AttributeDict to substitute, during parse_config_file, the subproject-specific values for the general project ones that had been redefined. While subprojects may not be a heavily used feature, I could see this being a fairly common error. It's not a big deal if a user is being careful and first using dry-run, but if the submission were actually done and submitted far more samples than intended, that could cost a lot of unintended compute time/$. Either way, the user would need to figure out what was wrong with the config file, which may not be entirely intuitive.
I think we've discussed keeping the config section name definition framework as flexible as possible. I definitely agree, but I think there could be some value in, say, using knowledge of keywords like output_dir and sample_annotation to suggest proper placement (i.e., some sort of warning if they're present but not placed within metadata). The keywords that come to mind are the common metadata ones: output_dir, sample_annotation, results_subdir, submission_subdir, pipeline_interfaces.
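As a rough illustration of the kind of warning suggested above, here is a minimal sketch that works on a config already loaded into plain nested dicts. It is not peppy's actual implementation: the function names (misplaced_keywords, check_config) are hypothetical, and the assumed layout is a top-level metadata section plus a subprojects section whose entries each carry their own metadata subsection.

```python
import warnings

# Keywords named in this issue as belonging under a "metadata" section.
METADATA_KEYWORDS = (
    "output_dir",
    "sample_annotation",
    "results_subdir",
    "submission_subdir",
    "pipeline_interfaces",
)


def misplaced_keywords(section, path=""):
    """Hypothetical helper: list metadata keywords set directly in `section`.

    `path` is the dotted location of `section` in the config
    ("" means the top level of the file).
    """
    found = []
    for key in METADATA_KEYWORDS:
        if key in section:
            target = ".".join(p for p in (path, "metadata", key) if p)
            location = path or "the top level"
            found.append(
                f"'{key}' is set under {location}; did you mean '{target}'?"
            )
    return found


def check_config(config):
    """Hypothetical checker: collect placement messages for the whole config."""
    messages = misplaced_keywords(config)
    subprojects = config.get("subprojects") or {}
    for name, sub in subprojects.items():
        if isinstance(sub, dict):
            messages.extend(misplaced_keywords(sub, f"subprojects.{name}"))
    return messages


# Example: "sp1" repeats the mistake described in this issue,
# while "sp2" nests its overrides under metadata as intended.
example_config = {
    "metadata": {"output_dir": "out", "sample_annotation": "samples.csv"},
    "subprojects": {
        "sp1": {"output_dir": "alt_out", "sample_annotation": "alt.csv"},
        "sp2": {"metadata": {"output_dir": "alt_out2"}},
    },
}

for message in check_config(example_config):
    warnings.warn(message)
```

Emitting warnings rather than raising errors keeps the section naming flexible: unknown or user-defined keys are never rejected, and only the handful of known metadata keywords trigger a hint.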
To take this a step further, at some point we may just want to implement a config file parser/checker that reports on the health of your config file. It could do a bunch of checks like this to suggest places that you could improve. This seems like a good thing to think about in the longer term, when PEP becomes more widespread.
Cool, I like the sound of that.
- looper config was moved out of the PEP config file, so this issue is partially outdated.
- parser/checker of config file
- Do we still want to implement it? What should it look like? Should we have a generic config schema?