A standard for execution config metadata?
In one of our projects that uses nbclient, we need to allow for a per-notebook timeout limit (https://github.com/ExecutableBookProject/jupyter-cache/pull/48). For example, people may want a "default" limit that is relatively low, but have "that one notebook" that they know takes way longer to execute, so they'd like to embed that information in the notebook in a way that can be utilized by downstream tools.
Is there a jupyter standard for adding execution-level metadata to a notebook? Can folks here recommend something reasonable? It would be good to have some standard practices so we don't start moving in separate directions unintentionally
Sorry I missed the ping for this issue in my inbox.
Information related to execution defaults in the notebook document is definitely not standardized. Most ecosystems track this information outside of the notebook to pass into the execution engine. The current assumption is tools would place any metadata they wish to reuse in a namespaced subsection of the metadata section of the notebook. For execution information though this does get fractured as there's papermill, nbclient, (older nbconvert), and some other external extensions to execution models that would then be in different metadata sections. Getting all the interfaces, both offline and online systems, to adhere to a spec around these is going to be challenging. I'd be open to starting with something in a specific metadata field of the notebook and then trying to promote it into the spec. There's been more appetite for introducing such changes, but also a reluctance to move nbformat spec versions forward as it has caused some friction with downstream systems in recent bumps.
Thoughts on what fields / attributes you'd want to control at first?
Hi @MSeal , thank you for your feedback on this.
I agree with the idea of not having the execution keyword inside an extension based namespaced subsection.
We were thinking of having a generic key like execution inside which we can have a timeout key
"execution": {
"timeout":
}
and then we can have other execution related metadata as well inside this if the situation arises in the future?
There is a cell['metadata']['execution'] placehold for storing execution information today in the 4.4 spec. Today it's housing what values were during execution rather than setting enforcement limits to adhere against.
I'd open an issue in nbformat with the proposed addition of timeout to that metadata field and we can get some other core developer voiced to weigh in on the subject.
@MSeal That's great. Thanks a lot.
Hi @MSeal . Let us know when you create the issue in nbformat. Thanks.
@AakashGfude I think that he was suggesting that you do it :-)
oops. sorry about that, I will create an issue in nbformat