
feat: BWA mem2 index - remove prefix parameter and determine prefix from output

Open christopher-schroeder opened this issue 2 years ago • 2 comments

Description

Remove the prefix param and infer the prefix from the output files.

QC

For all wrappers added by this PR, I made sure that

  • [x] there is a test case which covers any introduced changes,
  • [x] input: and output: file paths in the resulting rule can be changed arbitrarily,
  • [x] either the wrapper can only use a single core, or the example rule contains a threads: x statement with x being a reasonable default,
  • [x] rule names in the test case are in snake_case and somehow tell what the rule is about or match the tool's purpose or name (e.g., map_reads for a step that maps reads),
  • [x] all environment.yaml specifications follow the respective best practices,
  • [x] wherever possible, command line arguments are inferred and set automatically (e.g. based on file extensions in input: or output:),
  • [x] all fields of the example rules in the Snakefiles and their entries are explained via comments (input:/output:/params: etc.),
  • [x] stderr and/or stdout are logged correctly (log:), depending on the wrapped tool,
  • [x] temporary files are either written to a unique hidden folder in the working directory, or (better) stored where the Python function tempfile.gettempdir() points to (see here; this also means that using any Python tempfile default behavior works),
  • [x] the meta.yaml contains a link to the documentation of the respective tool or command,
  • [x] Snakefiles pass the linting (snakemake --lint),
  • [x] Snakefiles are formatted with snakefmt,
  • [x] Python wrapper scripts are formatted with black.

christopher-schroeder avatar Jul 06 '22 15:07 christopher-schroeder

Wouldn't it be cleaner to infer the prefix with os.path.commonprefix()?
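For illustration, a minimal sketch of what os.path.commonprefix() returns for the index files that bwa-mem2 index writes. The prefix "genome.fasta" is a hypothetical example, not taken from the wrapper's test case. Note that the comparison is character-wise, so the shared trailing dot has to be stripped, and a partial list of outputs can leak suffix characters into the result:

```python
import os

# Index files written by `bwa-mem2 index` for a hypothetical prefix "genome.fasta".
index_files = [
    "genome.fasta.0123",
    "genome.fasta.amb",
    "genome.fasta.ann",
    "genome.fasta.bwt.2bit.64",
    "genome.fasta.pac",
]

# commonprefix() compares character by character, so the result still
# contains the dot that every suffix starts with.
raw_prefix = os.path.commonprefix(index_files)
prefix = raw_prefix.rstrip(".")

# Caveat: if only some outputs are declared, characters shared by their
# suffixes (here the "a" of ".amb" and ".ann") end up in the result.
partial = os.path.commonprefix(["genome.fasta.amb", "genome.fasta.ann"])

print(raw_prefix)  # genome.fasta.
print(prefix)      # genome.fasta
print(partial)     # genome.fasta.a
```

So this approach is short, but it presumably needs the full set of outputs (or an extra check) to be reliable.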

Btw, since you are fixing the bwa-mem2 wrapper, do you think you could look into issue #522? It seems a straightforward fix... Thanks!

fgvieira avatar Jul 18 '22 18:07 fgvieira

As mentioned in issue #494, maybe we can define a function to infer the prefixes and add it to snakemake-wrapper-utils. Something like:

def infer_prefix(files, suffixes, strict=True):
    """Infer the prefix shared by `files` by stripping a known suffix from each."""
    prefixes = []
    # Try longer suffixes first so that each file is matched at most once.
    suffixes = sorted(set(suffixes), key=len, reverse=True)

    for file in files:
        for suffix in suffixes:
            if file.endswith(suffix):
                prefixes.append(file[: -len(suffix)])
                break

    if strict and len(prefixes) != len(files):
        raise ValueError("All files must have a valid suffix.")
    if len(set(prefixes)) != 1:
        raise ValueError("All files must share a common prefix.")

    return prefixes[0]
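
A usage sketch for the bwa-mem2 index case, with the function restated (trying longer suffixes first) so the example runs on its own. The suffix list matches the files bwa-mem2 index writes; the constant name BWA_MEM2_SUFFIXES and the path "ref/genome.fasta" are assumptions made up for this example:

```python
def infer_prefix(files, suffixes, strict=True):
    """Return the prefix shared by all files once a known suffix is stripped."""
    prefixes = []
    # Longest suffix first, so each file contributes at most one prefix.
    ordered = sorted(set(suffixes), key=len, reverse=True)
    for file in files:
        for suffix in ordered:
            if file.endswith(suffix):
                prefixes.append(file[: -len(suffix)])
                break
    if strict and len(prefixes) != len(files):
        raise ValueError("All files must have a valid suffix.")
    if len(set(prefixes)) != 1:
        raise ValueError("All files must share a common prefix.")
    return prefixes[0]


# Hypothetical constant: the suffixes of the files `bwa-mem2 index` writes.
BWA_MEM2_SUFFIXES = [".0123", ".amb", ".ann", ".bwt.2bit.64", ".pac"]

# In a wrapper these paths would come from the rule's output; here they are
# built by hand from a made-up prefix.
outputs = ["ref/genome.fasta" + suffix for suffix in BWA_MEM2_SUFFIXES]
prefix = infer_prefix(outputs, BWA_MEM2_SUFFIXES)

# Outputs that do not agree on a prefix are rejected.
try:
    infer_prefix(["a.amb", "b.ann"], BWA_MEM2_SUFFIXES)
    mismatch_error = None
except ValueError as err:
    mismatch_error = str(err)

print(prefix)          # ref/genome.fasta
print(mismatch_error)  # All files must share a common prefix.
```

Unlike a plain character-wise common prefix, this also works when only a subset of the index files is listed, since each file is checked against the known suffixes individually.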

fgvieira avatar Jul 19 '22 09:07 fgvieira