
Handling notebook execution assets

Open chrisjsewell opened this issue 4 years ago • 4 comments

An example of this is ExecutableBookProject/jupyter-cache#47

Here we would also like to use the assets argument of the stage_notebook_file method: https://github.com/ExecutableBookProject/MyST-NB/blob/b64cc307b38c14db469e410434895f13f63773ff/myst_nb/cache.py#L101

Firstly, I imagine it would be extremely difficult/impossible to run actual analysis of the code, to work out what assets are required. So then the user must specify these assets explicitly.

Then there are three possible approaches to achieve this:

  1. We search for a specific folder relative to the notebook, e.g. docname_assets
     • this would be an issue if multiple notebooks relied on the same asset
  2. they are specified as part of the notebook metadata
     • this would require reading the notebook before staging
  3. they are specified within the sphinx conf.py

(3) is probably the easiest option, something like:

jupyter_execute_assets = {
    "path/to/doc": [
        "doc_assets/*"
    ]
}

Here (a) all asset paths must be relative to the notebook folder (in this example the assets are at path/to/doc_assets), and (b) glob patterns can be used. This would be somewhat similar to package_data in setup.py.
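To make the path rules concrete, here is a minimal sketch of how such a config value could be expanded. The function name resolve_assets and the source_dir argument are hypothetical, not part of MyST-NB or jupyter-cache; the only assumption taken from the proposal is that each pattern is a glob resolved against the folder containing the notebook.

```python
from pathlib import Path

# Proposed conf.py value from the comment above (not an existing option).
jupyter_execute_assets = {
    "path/to/doc": [
        "doc_assets/*"
    ]
}


def resolve_assets(source_dir, assets_config):
    """Expand each docname's glob patterns relative to the notebook's folder.

    Hypothetical helper: returns {docname: sorted list of matching files}.
    """
    source_dir = Path(source_dir)
    resolved = {}
    for docname, patterns in assets_config.items():
        # "path/to/doc" lives in "path/to", so patterns are rooted there.
        notebook_dir = (source_dir / docname).parent
        matches = set()
        for pattern in patterns:
            # Keep files only; directories matched by the glob are skipped.
            matches.update(p for p in notebook_dir.glob(pattern) if p.is_file())
        resolved[docname] = sorted(matches)
    return resolved
```

The resulting list of files is what could then be handed to the assets argument when staging; sharing one asset between notebooks just means listing the same pattern under two docnames.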

chrisjsewell avatar Apr 05 '20 11:04 chrisjsewell

cc'ing @choldgraf @AakashGfude

chrisjsewell avatar Apr 05 '20 11:04 chrisjsewell

Thanks @chrisjsewell

I think (3) would require quite a bit of work from users, especially in a big project, since they would have one more thing to track: updating conf.py whenever they add or remove an asset from a notebook. (2) seems like a good option in my opinion.

Thinking about it, there could be one more option, where we create a variable containing patterns that should be excluded from being added as assets, like:

assets_excludepatterns = ['__pycache__']
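As a rough sketch of how such an exclusion list might be applied, assuming each pattern is matched against every component of a candidate path (that matching rule and the name filter_assets are assumptions, not an existing API):

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Proposed config value from the comment above (not an existing option).
assets_excludepatterns = ['__pycache__']


def filter_assets(paths, exclude_patterns):
    """Drop any path that has a component matching one of the patterns.

    Hypothetical helper: paths are POSIX-style strings relative to the
    notebook folder.
    """
    kept = []
    for path in paths:
        parts = PurePosixPath(path).parts
        excluded = any(
            fnmatchcase(part, pattern)
            for part in parts
            for pattern in exclude_patterns
        )
        if not excluded:
            kept.append(path)
    return kept
```

With this rule, everything next to the notebook would be treated as an asset unless it is excluded, so users only touch the config when something should be left out.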

AakashGfude avatar Apr 06 '20 00:04 AakashGfude

@chrisjsewell are these generated assets or user supplied assets? If I remember correctly you are using the term artefacts for generated assets?

Can we add a directive (i.e. dependency) for including assets required for compilation? This would be easier to parse, and would make it easy for the user to see which assets are required in each document.

Alternatively it could be added to the yaml at the top of the document

mmcky avatar Apr 07 '20 00:04 mmcky

> are these generated assets or user supplied assets

  • assets are input external files required by the notebook to execute
  • artefacts are output external files generated by the notebook execution

> Can we add a directive

Not a directive, because it needs to be known before fully parsing the document

> Alternatively it could be added to the yaml at the top of the document

This is my option (2) above, so could well be the route we go down
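
For illustration only, a minimal sketch of what reading the assets before staging could look like for an .ipynb file. The metadata key "execution_assets" and the function name are hypothetical; the only fact relied on is that a notebook file is JSON with a top-level "metadata" mapping, so it can be inspected without executing or fully parsing the document.

```python
import json


def read_asset_patterns(notebook_path, key="execution_assets"):
    """Return the asset glob patterns declared in a notebook's metadata.

    Hypothetical helper: `key` is an assumed metadata name, and a missing
    key is treated as "no assets".
    """
    with open(notebook_path, encoding="utf-8") as handle:
        notebook = json.load(handle)
    patterns = notebook.get("metadata", {}).get(key, [])
    # Fail early on malformed metadata rather than at staging time.
    if not isinstance(patterns, list) or not all(isinstance(p, str) for p in patterns):
        raise ValueError(f"{key!r} must be a list of strings in {notebook_path}")
    return patterns
```

A text-based notebook would need the equivalent lookup in its top-of-file yaml instead.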

chrisjsewell avatar Apr 07 '20 06:04 chrisjsewell