snakemake
snakemake copied to clipboard
Bring --draft-notebook to basic python scripts
Is your feature request related to a problem? Please describe.
I like to write scripts interactively (as is nicely supported via --edit-notebook), but I dislike the jupyter notebook ecosystem.
Describe the solution you'd like
It would be great to have an option --draft-script or --print-snakemake-object-python that prints the python code to populate the snakemake object, as in the first cell of the notebook launched by --edit-notebook.
Describe alternatives you've considered
- Stick to the notebook
- Copy the code line from the notebook However: This feature is about convenience and iterating without barriers (such as the bulky jupyter server), so it would be great to have this slim feature.
Additional context
- I would actually like the most to work in org-mode (via org-babel), thus an equivalent
--draft-orgof--draft-notebookwould be exactly what I need. But understand that it's probably too niche to support it. The proposed feature above would serve sufficiently though.
Hi,
I would also be interested in this feature and took some time to look into it.
As a temporary workaround, I saw that there is a mock_snakemake() function in the PyPSA/pypsa-eur repository that does something along those line (link to original PR. I'm attaching below a simplified version of their helpers.py file with only that function and a small wrapper I created to do a try/except so that scripts would work both with snakemake and with a regular IPython or Python session.
@johanneskoester would there be interest for a feature like this? If so I would be happy to try to contribute. I guess that within Snakemake the mock_snakemake() would not be an optimal implementation? I can see at least two paths to achieve this:
- Use the functionality already implemented by
PythonScript, but instead ofexecute_script()afterwrite_script(), snakemake could:- drop in a text editor, or
- Add a preamble to the top of the script and save it permanently, then the user can remove this preamble when no longer needed or update it if the snakefile changes, or
- write the file in an easily discoverable location (maybe the current script directory)"hang", let the user edit the draft script, and resume at some prompt.
- Use something closer to
mock_snakemake(or my slightly modifiedmockwrap()below) so that users who want to leverage this feature have to explicitly indicate it. The advantage of this would be that the preamble is dynamically re-generated every time the script is run.
Let me know if I should explore one of the following options. I'm new to Snakemake so there might very well be something I overlooked that makes this impossible or more complicated than I thought.
Thank you!
(Fixing this might also help with #247 and #2932 re debugging snakemake script with any editor's debugger)
The modified helpers.py with only mock_snakemake
# -*- coding: utf-8 -*-
# SPDX-FileCopyrightText: : 2017-2024 The PyPSA-Eur Authors
#
# SPDX-License-Identifier: MIT
import logging
from pathlib import Path
logger = logging.getLogger(__name__)
def mockwrap(
rulename,
root_dir=None,
configfiles=None,
submodule_dir="workflow/submodules/pypsa-eur",
**wildcards,
):
try:
from snakemake.script import snakemake
except ImportError:
from helpers import mock_snakemake
snakemake = mock_snakemake(
rulename,
root_dir=root_dir,
configfiles=configfiles,
submodule_dir=submodule_dir,
**wildcards,
)
return snakemake
def mock_snakemake(
rulename,
root_dir=None,
configfiles=None,
submodule_dir="workflow/submodules/pypsa-eur",
**wildcards,
):
"""
This function is expected to be executed from the 'scripts'-directory of '
the snakemake project. It returns a snakemake.script.Snakemake object,
based on the Snakefile.
If a rule has wildcards, you have to specify them in **wildcards.
Parameters
----------
rulename: str
name of the rule for which the snakemake object should be generated
root_dir: str/path-like
path to the root directory of the snakemake project
configfiles: list, str
list of configfiles to be used to update the config
submodule_dir: str, Path
in case PyPSA-Eur is used as a submodule, submodule_dir is
the path of pypsa-eur relative to the project directory.
**wildcards:
keyword arguments fixing the wildcards. Only necessary if wildcards are
needed.
"""
import os
import snakemake as sm
from snakemake.api import Workflow
from snakemake.common import SNAKEFILE_CHOICES
from snakemake.script import Snakemake
from snakemake.settings.types import (
ConfigSettings,
DAGSettings,
ResourceSettings,
StorageSettings,
WorkflowSettings,
)
script_dir = Path(__file__).parent.resolve()
if root_dir is None:
root_dir = script_dir.parent
else:
root_dir = Path(root_dir).resolve()
user_in_script_dir = Path.cwd().resolve() == script_dir
if str(submodule_dir) in __file__:
# the submodule_dir path is only need to locate the project dir
os.chdir(Path(__file__[: __file__.find(str(submodule_dir))]))
elif user_in_script_dir:
os.chdir(root_dir)
elif Path.cwd().resolve() != root_dir:
raise RuntimeError(
"mock_snakemake has to be run from the repository root"
f" {root_dir} or scripts directory {script_dir}"
)
try:
for p in SNAKEFILE_CHOICES:
if os.path.exists(p):
snakefile = p
break
if configfiles is None:
configfiles = []
elif isinstance(configfiles, str):
configfiles = [configfiles]
resource_settings = ResourceSettings()
config_settings = ConfigSettings(configfiles=map(Path, configfiles))
workflow_settings = WorkflowSettings()
storage_settings = StorageSettings()
dag_settings = DAGSettings(rerun_triggers=[])
workflow = Workflow(
config_settings,
resource_settings,
workflow_settings,
storage_settings,
dag_settings,
storage_provider_settings=dict(),
)
workflow.include(snakefile)
if configfiles:
for f in configfiles:
if not os.path.exists(f):
raise FileNotFoundError(f"Config file {f} does not exist.")
workflow.configfile(f)
workflow.global_resources = {}
rule = workflow.get_rule(rulename)
dag = sm.dag.DAG(workflow, rules=[rule])
wc = dict(wildcards)
job = sm.jobs.Job(rule, dag, wc)
def make_accessable(*ios):
for io in ios:
for i, _ in enumerate(io):
io[i] = os.path.abspath(io[i])
make_accessable(job.input, job.output, job.log)
snakemake = Snakemake(
job.input,
job.output,
job.params,
job.wildcards,
job.threads,
job.resources,
job.log,
job.dag.workflow.config,
job.rule.name,
None,
)
# create log and output dir if not existent
for path in list(snakemake.log) + list(snakemake.output):
Path(path).parent.mkdir(parents=True, exist_ok=True)
finally:
if user_in_script_dir:
os.chdir(script_dir)
return snakemake