papermill
New feature: Inspecting notebook with different tags (besides parameters)
I'm liking the inspect_notebook function but for our use case we would like to inspect notebook cells that are tagged with something else besides parameters
inspect_notebook documentation
we can add another function argument in inspect_notebook (like tag) and default it to parameters
https://github.com/nteract/papermill/blob/main/papermill/inspection.py#L102-L123
def inspect_notebook(notebook_path, tag="parameters", parameters=None):
...
also add a tag argument (default parameters) and populate _infer_parameters with tag
parameter_cell_idx = find_first_tagged_cell_index(nb, "parameters")
https://github.com/nteract/papermill/blob/main/papermill/inspection.py#L37
and find_first_tagged_cell_index will return the first cell it finds with the specified tag value
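To make the proposal concrete, here is a minimal sketch of the lookup on a notebook represented as a plain dict. It is not papermill's code: first_tagged_cell_index and tagged_cell_source are hypothetical stand-ins for the behavior described above, the "config" tag is invented for illustration, and returning -1 when no cell matches is an assumption of this sketch.

```python
from typing import Optional


def first_tagged_cell_index(nb: dict, tag: str) -> int:
    """Return the index of the first cell carrying `tag`, or -1 if none does.

    Hypothetical stand-in for the lookup discussed above; it works on a plain
    dict shaped like notebook JSON rather than on a papermill notebook object.
    """
    for idx, cell in enumerate(nb.get("cells", [])):
        if tag in cell.get("metadata", {}).get("tags", []):
            return idx
    return -1


def tagged_cell_source(nb: dict, tag: str = "parameters") -> Optional[str]:
    """Return the source of the first cell tagged `tag` (default: parameters)."""
    idx = first_tagged_cell_index(nb, tag)
    return None if idx < 0 else nb["cells"][idx]["source"]


# A tiny in-memory notebook; the "config" tag is made up for this example.
nb = {
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": "# Demo"},
        {"cell_type": "code", "metadata": {"tags": ["parameters"]}, "source": "alpha = 0.1"},
        {"cell_type": "code", "metadata": {"tags": ["config"]}, "source": "queue = 'gpu'"},
    ]
}

print(tagged_cell_source(nb))                 # default tag: the parameters cell
print(tagged_cell_source(nb, tag="config"))   # opt in to a different tag
print(tagged_cell_source(nb, tag="missing"))  # no matching cell
```

Because the tag defaults to parameters, callers that never pass it see the same cell they do today.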
What do you think? I can go ahead and make a PR if this would be used in the future.
My biggest reservation about this is the fact that a lot of UIs (thinking about nteract and Jupyter Notebook) special case their client-side experience for working with parameterized cells by checking for the "parameters" tag. Allowing arbitrary labels here will break that convention for front-end clients.
I'll let other folks chime in on other motivations for this.
@captainsafia thanks for the feedback 👍
inspection of the notebook will always default to inspecting cells tagged with parameters but we just want to be given the option to inspect other cells as well
the functionality is already there and would require a small tweak:
parameter_cell_idx = find_first_tagged_cell_index(nb, "parameters")
it'll ensure backwards compatibility because the value will default to parameters, and ideally it won't affect other parts of the library that use the find_first_tagged_cell_index function
let me know what you and your team think, thanks 👍
@willingc @rgbkrk Thoughts on this?
@DustinKLo walk me through what the next feature is you would need if we inspected cells that are tagged something other than parameters. What would you want to do with that cell or notebook?
Historically, we've tried to keep Papermill's scope conservative and focused on the execution of parameterized notebooks. Like @rgbkrk, I would be interested in @DustinKLo's use cases.
I'm a bit hesitant to open up inspect to things beyond parameters. I could see how inspecting cells for "tags" may be useful. It's not clear from the messages above if the expectation would be execution based on tags too. Use cases would make it a bit easier to determine whether to add the functionality in papermill or somewhere else. ☀️
thanks for your input 👍
We simply just want to inspect cells that are tagged with something else besides parameters
We use the inspect_notebook function to extract extra variables to build the proper configuration files in our science data system
It would be nice to separate cells to make it a little cleaner/organized
no other use cases besides inspecting cells 👍
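As a rough illustration of that use case, the sketch below pulls simple literal assignments out of a tagged cell's source so they could feed a configuration file. It is an assumption-laden example, not how papermill infers parameters: literal_assignments and the variable names are hypothetical, and it uses Python's ast module, so it only applies to Python cells.

```python
import ast


def literal_assignments(source: str) -> dict:
    """Collect `name = <literal>` statements from a cell's source.

    Hypothetical helper: anything that is not a plain literal (function
    calls, references to other names, tuple unpacking) is skipped.
    """
    values = {}
    for node in ast.parse(source).body:
        is_simple = (
            isinstance(node, ast.Assign)
            and len(node.targets) == 1
            and isinstance(node.targets[0], ast.Name)
        )
        if not is_simple:
            continue
        try:
            values[node.targets[0].id] = ast.literal_eval(node.value)
        except (ValueError, TypeError):
            # Right-hand side is not a literal; leave it out of the config.
            continue
    return values


# Source of a cell carrying a made-up tag such as "config".
cell_source = (
    'dataset = "granule-A"\n'
    "threshold = 0.5  # comments are ignored by the parser\n"
    "helper = compute()\n"
)

config = literal_assignments(cell_source)
print(config)
```

Here dataset and threshold end up in the dict, while helper is dropped because its value is a call rather than a literal.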
The use case itself sounds like you want to inspect cells to create configuration files -- is that the best way to put it? These are used by a separate system? Does the notebook itself still need to run or are you just using papermill to do inspection?
@rgbkrk yup you got it correct 👍 we're using papermill for inspection to build the jobs; the actual execution of the notebooks is done separately (in a docker container)