enhancement-proposals
[proposal] Jupyter Extension Generator
This proposes an extension scaffolding/generation approach, initially based on creating a new yeoman generator package to be distributed on npm.
Request for review (based on previous discussion): @ellisonbg @jdfreder @sccolbert @rgbkrk @damianavila
Please ping others that may be interested! I will post this to the jupyter and ipython lists as well!
@bollwyvl thanks for putting this together!
- I am very much in favor of using something like yeoman for creating Jupyter related project templates.
- Right now, the word "extension" is very overloaded in the project. With the move to phosphor, the word "extension" in the frontend side will have a very formal meaning. All that to say - I think we should describe this differently than using the word "extension".
- The existing nbextension approach for extending the frontend is going to go away (I am of the opinion that it never should have been publicized). Because of this, we should not provide templates for old style nbextensions. Obviously, we should provide templates for the new npm style extensions.
- We are creating more and more npm packages, so having templates for those would be super helpful. It would be great to sync with @sccolbert and others on the templates they have been using for frontend dev.
A few more comments after looking through the list of possible templates:
- We should focus initially on 1-3 templates that will have the highest impact and get those really right.
- The follow up questions should be narrowed to only the scope of things actually needed. For example, if someone picks a template that is Python based, there is no need to ask them what package manager they will use. Likewise for JavaScript (npm).
- We should design the templates to ask the user for as little information as possible.
- We shouldn't build templates for completely generic things (just a plain Python package).
@ellisonbg thanks for the input! I'll try to answer some of the pieces below, and reflect on how to update the PR. PRs on PRs welcome!
the word "extension" is very overloaded in the project.
Everything is up for grabs. I think "feature" is a little too specific, as a single development effort would very likely use many features. They aren't `entry_points` (not that node/npm has those). They could be "job stories," as picking a particular set of features is really setting you up for a (reduced!) work debt. Open to suggestions! Something Galilean.
existing nbextension approach for extending the frontend is going to go away ... Obviously, we should provide extensions for the new npm style extensions.
It is known. But without a Q?2016 estimate of when the replacement will ship, folks that have built, or want to build and deliver, code on/around the incumbent mechanism have knowledge debt that's not going anywhere in the timeframe where users need it: now, and until 5.x is shipped, ready to deploy internally, or ready to ship as part of a bundled software project. And 4.x will be around for a while even after that, longer if the deployment burden is noticeably larger.
Let's use this as a way to avoid repeating history, and have something actionable we can argue about concretely.
more and more npm packages,
Huzzah! It's got us covered as a first class citizen. A `package.json` is probably the metadata format most compatible with the shape of the rest of our data.
1-3 templates that will have the highest impact
Yes, the basic intent is for each set of options to be ordered from "we have to, but we don't have to like it" to "this is a good opinion" up to "this would be awesome... PRs welcome!" Perhaps we can specify a few "opinion bundles"... or better still, "make me something like bqplot"... but that's a huge metadata debt.
On the broader story: I think if it's worth developing, it's worth letting any member project from the ecosystem be able to get involved.
An (atypical) example: nbpresent is a notebook nbextension, backend server extension (to configure the nbconvert service), a couple nbconvert exporters... and I really want to use the gamepad widget as a presenter mouse... and it might even make sense to throw the nbviewer piece in as well, and maybe add a PDF ContentManager, since it can ship PDF with embedded notebooks for archival. Okay, probably not the last thing, but still.
The follow up questions should be narrowed to only the scope of things actually needed
Gah, I thought I made the graph somewhat explicit by using the "Ah, I see you have X" questions. I guess that's the reason I put all those anchors in there: to make an (in)compatibility matrix. My little systems engineering heart is going thump thump.
As to packaging... well, packaging is hell. I would agree with you that The Opinion was PyPI... but evidence suggests that the ecosystem is using conda all over the place. In npm land, I'll agree with you... but npm is only the beginning of that toolchain. So we have to keep digging.
as little information as possible
Yeah. Well, we can try! By keeping the individual sets of options 7±2 (after maybe the first one), the choices shouldn't be overwhelming, and by bounding future questions by the compatibility matrix, there shouldn't be any silly questions by the end.
A notional future GUI for this experience would be even better, especially one that let you interactively see what impact a decision has on the repository you end up with. That might be a killer app for a widgety notebook up on mybinder... run it in two panes, file browser and notebook, every keystroke re-runs the whole thing. Possibilities are endless.
shouldn't build templates for completely generic things
Agreed! We're trying to show developers how to develop with the Jupyter ecosystem, not Just Write Code.
@ellisonbg although we should probably discuss how we are going to transition from nbextensions to the new mechanism in a single code base. I am naturally especially interested into migrating pythreejs and bqplot, where a significant effort was put.
This is actually a really big point... there are a lot of extensions out there... and we should put some effort into making them/us suffer as little pain as possible with the transition... I know we don't have an official API, and people know that we can break things... but there is a lot of work out there that will need some kind of "easy" transition pathway, and I think we should do our best to offer that if it is possible in the new infrastructure.
@bollwyvl Thanks for putting this together. This would have been immensely helpful when bootstrapping the incubator projects as extensions. I know what we did there is still not ideal and I still don't know any better best practices (but suspect others do!)
I don't know how to contribute here at the moment beyond listing what our incubator extensions currently require like you did with nbpresent. Maybe thinking about a set of existing extensions both in terms of what templates they would use if started today and what templates they would need when being ported to the new "workbench" pluggable base will help guide decision making about the yeoman options.
Here are two. @lbustelo can cover the declarative widgets.
jupyter-incubator/dashboards
Currently includes both:
- Frontend JS for adding toolbars, menu items, layout mode to the notebook UI
- Backend notebook server tornado handler for handling dashboard bundling and deployment
If picking from the template list currently in the pull request and assuming the templates are for Jupyter 4.x as it exists today, I guess it would be refactored into JavaScript Notebook (nbextension) for the frontend and Server Route (notebook) for the backend handler bit.
In Jupyter 5.0+, I think (maybe incorrectly) dashboarding is going to become part of the workbench UI rather than something you do to a notebook. So there might not be any porting of this work there. But if I still wanted to plug-in the current idea of laying out a notebook as a dashboard and deploying it as a separate app, I'm guessing I'd still go with the JavaScript Notebook (nbextension) for the frontend and Server Route (notebook) options (but would expect a Phosphor plugin base for the former instead of the legacy jQuery based one above).
jupyter-incubator/contentmanagement
Currently includes:
- Frontend JS for new search hot keys, actions, menu items
- IPython kernel extension that adds a magic for snippet insertion and enables importable notebooks
- Backend notebook server tornado handler for performing searches with whoosh and returning results
Like the above, I guess these would have to be refactored into three separate extension templates that would somehow get installed together and work together:
- JavaScript Notebook (nbextension) for the frontend bits
- Server Route (notebook) for the backend bits supporting the frontend bits
- UNKNOWN for the kernel extension: it's a line magic but also more
Perhaps it would also be useful to describe what nbgrader includes. nbgrader itself started out as an extension building on top of nbconvert, but I think it's complex enough at this point that it can be considered as just a standalone thing. It does, however, include two extensions to the notebook interface:
- Create Assignment extension
- Frontend JS for the "create assignment" cell toolbar
- Assignment List extension
- Frontend JS for the notebook list that adds an Assignment tab
- Backend notebook server tornado handlers for running the various nbgrader commands
And I have been working on another extension as well that allows the formgrader to be accessed through the notebook server by hooking up its tornado handlers to the notebook.
Great feedback, all!
For a developer, making the architectural choice of "what do I need to do" is a huge step toward starting, and real examples of useful software are better than a menu of templates for getting the lay of the land. Indeed, it would be helpful to have canonical examples for each extension point when creating the template fragments, but presenting links to the user for reference will be an important documentation step.
In addition to the first choice of "what", no doubt other choices were made as to "how" each of these things became software... I'll certainly take a look at how those opinions shake down across the projects listed.
Re: 5.0 changes: I moved the versions-to-support question higher in the list, and likely all the extension points should be labeled accordingly. A really valuable piece will be to target and test against both the old and new APIs. If everything has to be redone clean-slate, we've sacrificed a lot of contributor-hours in the name of... something. Not sure what that will be yet. If we can provide a structured path forward, we may be able to salvage some goodwill.
jupyter-incubator/declarativewidgets
Currently includes:
- Frontend JS for loading the core set of elements on the page
- IPython kernel extension for Python to add the Python side of the elements
- IPython kernel extension for spark-kernel to add the Scala side of the elements
- Backend notebook server tornado handler for bower installing new elements at runtime
Started some of my exploration of things mentioned in this spreadsheet... sorry for the Google doc, but I found markdown not the right tool for the job. Once I feel better about it, I'll add it to the PR as a CSV... or dump it to markdown... but until then... anybody who'd like to slog through some data entry, and either fix what I did wrong or add other key things, ping me on gitter :) I'd like to add a ContentManager (ipymd, or jupyter-drive), some kernels, etc., but really anything anyone would like to volunteer would be greatly appreciated.
I would consider the spreadsheet to be the canonical list of features, etc (vs the yeoman screenplay). I think, properly matured, this could be the beginning of a stub for the eventual repo. Exciting stuff!
A few questions and comments about Python/Server side:
- Anything that is an extension on the server/python side is just an importable Python package right?
- I don't think it is worth our effort, at least initially, to create yeoman templates for just plain Python packages. Our Python APIs are much more stable and the pain of creating Python packages is relatively small.
@bollwyvl I had a look at the spreadsheet and it isn't really obvious what information it is encoding. What are the rows? columns? Can you describe a bit more what you are after there?
On the JavaScript/CSS/HTML side, I don't yet know if it will be possible for a single code base to support both the existing (4.x) and new (5.x and beyond) frontend APIs. My initial thought is that it will be so different that it is not going to be possible. I don't think anything will be the same (DOM structure, CSS classes, packaging, loading, JS APIs, etc.).
In terms of transition, I think yeoman templates, demo extensions, and documentation are going to be how folks get going on writing to the new APIs. We do want to make that as easy as possible for sure.
Anything that is an extension on the server/python side is just an importable Python package right?
In theory, it's just a module that has `load_jupyter_server_extension`... but...
I don't think it is worth our effort, at least initially, to create yeoman templates for just plain Python packages. Our Python APIs are much more stable and the pain of creating Python packages is relatively small.
In practice, the pain of packaging for the myriad ways a user might want to install/configure the developer's extension is not relatively small. Each of the repositories listed has its own, separate way of dealing with this, as there isn't a standard way (i.e. `entry_points`). Combining this with server/user/env/custom config dirs, it's not a nice place to be. Hence, I think stubbing out these things absolutely is critical.
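To make the "in theory" contract above concrete, here is a minimal sketch of the 4.x-style server extension hook. The route and messages are invented for illustration, and `FakeNotebookApp` is a stand-in for the real `NotebookApp` that the notebook server would pass in; a real extension would register tornado handlers via `nbapp.web_app.add_handlers`.

```python
# A hypothetical minimal 4.x server extension: any importable module that
# defines load_jupyter_server_extension(nbapp) can be enabled via config.
registered_routes = []

def load_jupyter_server_extension(nbapp):
    # A real extension would register tornado handlers here, e.g.:
    #   nbapp.web_app.add_handlers(r".*$", [(r"/hello", HelloHandler)])
    registered_routes.append("/hello")
    nbapp.log.info("hello extension loaded")


class FakeLog:
    """Stand-in logger, just to show the call shape (illustration only)."""
    def __init__(self):
        self.lines = []

    def info(self, msg):
        self.lines.append(msg)


class FakeNotebookApp:
    """Stand-in for notebook.notebookapp.NotebookApp (illustration only)."""
    def __init__(self):
        self.log = FakeLog()


app = FakeNotebookApp()
load_jupyter_server_extension(app)
print(registered_routes)  # -> ['/hello']
```

Everything beyond that one function, as noted above, is where the packaging pain lives.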
I had a look at the spreadsheet and it isn't really obvious what information it is encoding. What are the rows? columns? Can you describe a bit more what you are after there?
Each row is a feature that someone might want to include in their product, or the process they can use to create that product.
- `Category` is a high-level name for the kind of thing we're talking about.
- `Name` is the human-readable name.
- `Module` is where the business logic and templates would be stored in the generator repository. This will ensure that one would be able to read the meaning of a pull request very easily from the list of files introduced.
- `Upstream API` is the upstream repo that would need to be tracked for changes, informing when the generator would have to be updated.
- Each `In the Wild` column is a specific repo mentioned above. `TRUE` indicates that the feature is used by that project. Long text values are `TRUE`, but with some caveat.

The goal would be to find example repositories for all of the features (or trim features that were just my brainstorming) to a) be linked to directly in documentation and b) serve as starting points for templates.
In this section I describe how features can demand other features, and then how those features may or may not create files. Filling out this matrix will be involved, so I wanted to do some validation of things I thought I had read/seen.
For example, saying you want an `extension-point/notebook-ui-js` wouldn't immediately create an `index.js`. Instead, it would demand a feature that provides an `index.js`, such as `js-dialect/es5/` or `js-dialect/es2015/`. But if you pick `js-dialect/es2015/`, then you'll need a `runner`. This continues until it reaches a steady state and no more features are required. THEN you can actually make files.
On the JavaScript/CSS/HTML side, I don't yet know if it will be possible for a single code base to support both the existing (4.x) and new (5.x and beyond) frontend APIs.
Can it be a design goal? It doesn't have to be free: if the answer is "start using npm and stealjs right now, and browserify," that's fine, but again, at least for a while maintainers are going to have to support both. On the CSS side, start giving us some namespaces, e.g. `body.jupyter-v4`. For a number of extensions, the challenge is really the integration with the external library, so we don't need to make it any more challenging.
The 4.x version won't be going away overnight. The workbench will live in a separate browser tab alongside the old version for a while. Would this be sufficient for easing the transition?
We've been working with @bollwyvl @damianavila and others at Continuum to create a small tool-chain to support easy-installation and encapsulation of Jupyter 4.x extensions, along with a few extensions built using the tool-chain and a few existing ones ported over. My outline of what we've done:
- generator to stub-out a basic extension with a minimal set of Python and JS code in a simple directory/file organization (this builds directly on the suggestions in this JEP)
- nbsetuptools to encapsulate the standard mechanics for installing, enabling, disabling an extension (a Python module)
- nbwrapper to set up a Jupyter installation to have extensions tied to the Jupyter code-base, not to the user's `~/.jupyter` directory. It creates some base directories and files (inside `site-packages`) and also provides a wrapper script `anaconda-notebook` that simply sets `JUPYTER_CONFIG_DIR` and `JUPYTER_DATA_DIR` based on the current conda environment path, thus allowing extensions to be installed with conda and encapsulated into a conda environment.
The repos for all of these are already public if you dig around, but the content is still WIP and lacks documentation and testing. Our next sprint goal is to get it to a place where we can do a limited circulation for "friendly" initial review and feedback, with a plan to have some blog posts and perhaps a webinar in January/February 2016 about the power of Notebooks for technical computing and (unsurprisingly) the benefits of integrating them with conda and the Anaconda ecosystem. We'll also spend some time on Jupyter extensions and the "streamlined" extension authoring mechanism we've developed (specifically for our own extension creation, but we are making the tooling all open source). And yes, @sccolbert (and @bollwyvl and @damianavila) have been keeping us up to date with how 5.x completely reworks the concept of "extension".
@ijstokes @bollwyvl @damianavila I think there are interesting directions here, but other aspects of what you are proposing will increase the net pain of users of Jupyter and Anaconda.
The generator stuff could be very helpful to developers who want to write extensions. I think everyone's time at this point would be better spent working on the new 5.0 stuff, but I understand you have different goals that are served by this work.
On the side of nbsetuptools and nbwrapper I think it would be a technical mistake to pursue the approach you describe. Here is why:
- The 4.x extension system (the packaging and distribution side of it) has fundamental problems at its core.
- You are proposing to duct tape over those problems by essentially forking the top level of the notebook, wedding yourself to these problems, and providing a solution that only works in anaconda.
- Users will be frustrated by things that only work on anaconda. While anaconda is great, it is not the only way that people install and use Jupyter.
- Developers of extensions will be frustrated by having to support our existing approach, whatever you put together....and also the new approach we are building for 5.0. That isn't less pain, but more.
- When we completely remove the 4.x nbextension abstraction, your entire approach will completely stop working. You will be faced with a painful choice of abandoning users/devs or further forking the notebook.
After talking with a few people, I think there is a way of addressing the packaging/installation challenges for 4.x nbextensions that doesn't have these problems.
I propose:
- For 5.0, we completely remove the existing way that we load 4.x nbextensions.
- We allow 4.x nbextensions to be packaged and distributed as npm modules. This will be simple as we will already be doing that for the entire 5.0 code base.
- We build a toolchain in Jupyter that is npm based for installing both 4.x extensions and 5.0 JS code.
- Anaconda can wrap that npm toolchain to provide a single entry point for both 4.x extensions and 5.0 JS.
- We change the existing notebook pages to load 4.x nbextensions from node_modules, but still be able to target the 4.x JavaScript APIs (only on the existing notebook pages though, not the workbench/lab).
We are already basically pursuing this approach for 5.0. The only new thing I am proposing is that we enable 4.x nbextensions to be managed in the same exact way (npm all the things!).
Benefits:
- Users, packagers and developers will have a single approach for managing JS code related to Jupyter 5.0.
- Extension authors only have to worry about 1 thing - shipping an npm package - regardless of whether they want to target our 4.x or 5.0 JS APIs.
- In some cases, it may be possible for a single npm package to target both APIs by having code paths for both.
- It will be much easier to maintain the generator code for both 4.x nbextensions and 5.0 as it will also just be npm based templates.
Some technical notes:
- Given how npm works, there will need to be a single `node_modules` for each notebook installation (env).
- The Jupyter Notebook will ship with a default set of packages in that `node_modules` so the basic notebook can be pip installed without npm/node present.
- The only way to install additional packages into that `node_modules` will be to use npm. Any additional tools provided by Jupyter or Anaconda for managing these packages will necessarily call out to npm. It would be crazy to try and replace npm for these purposes.
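A hypothetical sketch of such a wrapper tool, showing the "compose an `npm` invocation, don't reimplement it" design. The command layout and the `share/jupyter` install prefix are invented for illustration:

```python
# Build the npm command a Jupyter/Anaconda wrapper tool would delegate to,
# targeting the single per-installation node_modules described above.
import os
import subprocess

def npm_install_command(package, prefix):
    """Compose the npm invocation for installing into the env's JS tree."""
    js_dir = os.path.join(prefix, "share", "jupyter")  # illustrative layout
    return ["npm", "install", "--prefix", js_dir, package]

def install_js_package(package, prefix):
    """Delegate the actual work to npm; we never replace it."""
    return subprocess.run(npm_install_command(package, prefix)).returncode

print(npm_install_command("my-nbextension", "/opt/env"))
```

Any higher-level tool (a `jupyter`-branded subcommand, a conda post-link script) would just be sugar around that one `npm install` call.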
@ellisonbg thanks for the quick reply. @damianavila understands the details of what we've done and historical, present, and future situation for Jupyter better than I do, so he said he plans to give a more detailed response, but from a high level I have a few comments and follow up questions:
At the highest level, @bollwyvl wrote this JEP about yeoman-based templates to automate the construction of Jupyter extensions. It sounds like there is general agreement that such a thing would be valuable, and that your main suggestion is that any implementation should focus on node-based package distribution in a way that can simultaneously be used for 4.x style extensions and 5.x style extensions. Perhaps even supporting both kinds of extension in the same module.
I think a lot of the rest of the discussion is a critique that goes beyond the JEP because I (probably mistakenly) ended up sharing the larger context of the work we're doing at Continuum which motivated this JEP around template-based extension creation. Be that as it may, the discussion is started and I do have some responses to the first half (spoiler: the second half of @ellisonbg critique, which outlines an NPM-centric packaging strategy, seems reasonable and we'd be happy to consider trying this in our implementation).
On to responses to the first half:
other aspects of what you are proposing that will increase the net pain of users of Jupyter and anaconda.
That would be a bad outcome. We definitely want to avoid net pain. So it is a high priority for me to understand if that is really what we're setting ourselves up for. Sharing our plans here at an early stage is part of trying to get that feedback so we don't do something counterproductive.
I think everyone's time at this point would be better spent working on the new 5.0 stuff
We had hoped we were going to be able to do that, but I think 5.0 is maybe 3 months beyond what we need to support the user community we're working with right now. So we needed to harden and standardize a set of existing extensions.
Furthermore, I think there is definitely some benefit to having a generator in place that we can talk about with a "known entity" which are the 4.x style extensions, rather than co-mingle the new 5.x extension format and attempts to create generators for that format. I think there would be a lot of debate about things that had nothing to do with the generator and instead were tied to people just shaking out opinions about how best to construct 5.x extensions.
You are proposing to duct tape over those problems by essentially forking the top level of the notebook, wedding yourself to these problems, and providing a solution that only works in anaconda.
I'm sorry for misleading anyone into thinking there was a fork. There is no fork.
The solution we've developed does not just work with Anaconda, but getting it working with Anaconda with as little pain as possible is our primary goal.
The 4.x extension system (the packaging and distribution side of it) has fundamental problems at its core.
Yes, we're very aware of that and have been dealing in an ad hoc way with those problems for the past 1+ years. That is why at this point we had hoped to invest time only in 5.x style extensions, but given the timelines for 5.x and the relatively dramatic shift in foundation that will come with it, we realized that we had a user base we will need to support through much of 2016 with 4.x style extensions, and those existing extensions we had implemented needed to be standardized and improved. We are not in a position to say either "You have to shift to 5.0" or "We aren't supporting Jupyter 4.x any more". At least not until the second half of 2016.
Users will be frustrated by things that only work on anaconda. While anaconda is great, it is not the only way that people install and use Jupyter.
They will not only work with Anaconda. They will be "equivalently awkward" to install outside of Anaconda as any other existing Jupyter extension is. But with Anaconda they will be easy to install with simply `conda install extensionname`. In fact, conda packages for these extensions were the last thing we created. For all our development process we had scripted awkward mechanisms to copy all the files into the right locations and modify the configuration files.
Developers of extensions will be frustrated by having to support our existing approach, whatever you put together....and also the new approach we are building for 5.0. That isn't less pain, but more.
This is less pain for us right now, and will be for the next 3-6 months. And once 5.0 is out I know it will be less pain for us to use our existing tools with our maintenance of the old 4.x extensions, regardless of what processes and tooling is available for creating/porting/maintaining 5.x extensions. (our hope is that we can learn from our existing "standardization" and "tooling" of the 4.x extensions so that we can contribute to something even better and more informed to facilitate creation of 5.x extensions/plugins).
BUT I do take your point and we have been discussing exactly the question of whether we should just keep these "quiet" and only promote the tools once 5.0 is out and we have adapted them in some way to the 5.x world, with 4.x support only available as a historical footnote.
When we completely remove the 4.x nbextension abstraction, your entire approach will completely stop working.
We only expect the current implementation to work with Jupyter 4.x. When Jupyter 5.0 is released we would hope to figure out what the logical transition is in the tool chain to support the creation of 5.x extensions. I don't quite understand this criticism: most (all?) 4.x extensions will stop working in 5.x, I would imagine.
We allow 4.x nbextensions to be packaged and distributed as npm modules. This will be simple as we will already be doing that for the entire 5.0 code base.
Great. We'd be happy to help create these npm modules from the tool chain as it evolves. Anaconda Cloud, incidentally, already supports npm modules and is an NPM package mirror: https://npm.anaconda.org/ (I share this just to make it clear that we are not conda-package-zealots). I don't know enough about npm to know what advantages it provides over Wheels or Conda packages for software distribution: mostly my experience with NPM is frustration and confusion, I have to admit, but I accept that npm and JS aren't going anywhere any time soon.
I appreciate the insights into what is planned for 5.0 extensions and distribution of those extensions. We'll definitely consider your suggestions to evolve the tools we've written into something which is more "npm-package-centric".
The situation we're in is that we've developed about half a dozen extensions that either have been in use for the past ~year, or recently have been developed to facilitate real user requirements. We need these extensions now, and the users of these extensions anticipate using Jupyter 4 "style" notebooks through 2016. We weren't happy with the ad hoc mechanisms we'd been using to construct extensions, so we invested ~1-2 person-weeks of effort to identify the common parts, create abstractions to capture those, refactor the existing extensions to use the common libraries, and then think about what we needed to do to make it easier to write (or re-write) the next half dozen extensions we needed to get through. For that last step we decided to write a generator off of some template files. Creating conda recipes for these (and auto-generating conda recipes from templates) allowed an extension to be installed simply with `conda install foo_extension` rather than having to go through multiple steps including config file editing.
Finally we needed a way to encapsulate extensions so they were only installed in individual conda environments rather than "per-user". Basically not much more than a script that did:

```shell
JUPYTER_CONFIG_DIR=$CONDA_ENV_PATH/etc/jupyter \
JUPYTER_DATA_DIR=$CONDA_ENV_PATH/share/jupyter \
jupyter notebook
```
But we hope to figure out a way (probably necessitating a Jupyter PR) that will allow us to get rid of this wrapper script and have Jupyter search all the directories in `jupyter --paths` for extensions, rather than just the first (or the one pointed at by the environment variables).
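That hoped-for behavior can be sketched as: instead of reading nbextension config from only the first directory, merge it from every entry on the search path, with earlier (more specific) paths winning. The file name and config keys below are illustrative, not the notebook's actual schema:

```python
# Merge JSON config from every directory on the search path, earliest
# directory taking precedence, instead of stopping at the first hit.
import json
import os
import tempfile

def load_merged_config(search_paths, name="notebook.json"):
    merged = {}
    for path in reversed(search_paths):  # later paths = lower precedence
        cfg = os.path.join(path, name)
        if os.path.exists(cfg):
            with open(cfg) as f:
                merged.update(json.load(f))
    return merged

# Tiny demo: an env-level dir and a user-level dir each contribute config,
# and the user-level (earlier) path wins on conflicts.
env_dir = tempfile.mkdtemp()
user_dir = tempfile.mkdtemp()
with open(os.path.join(env_dir, "notebook.json"), "w") as f:
    json.dump({"env-ext/main": True, "shared": "env"}, f)
with open(os.path.join(user_dir, "notebook.json"), "w") as f:
    json.dump({"user-ext/main": True, "shared": "user"}, f)

merged = load_merged_config([user_dir, env_dir])
print(merged["shared"])  # the more specific (user) path wins
```

Extensions installed into any environment's `etc/jupyter` would then be picked up without any wrapper script at all.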
Brian, several clarifications below:
but other aspects of what you are proposing will increase the net pain of users of Jupyter and anaconda.
We are just trying to avoid that :wink:
The generator stuff could be very helpful to developers who want to write extensions. I think everyone's time at this point would be better spent working on the new 5.0 stuff, but I understand you have different goals that are served by this work.
We are just trying to have a solution for some very painful issues involving nbextension, server-extensions and nbserver-extensions (I mean, a server-extension + a client side component)
The 4.x extension system (the packaging and distribution side of it) has fundamental problems at its core. You are proposing to duct tape over those problems by essentially forking the top level of the notebook, wedding yourself to these problems, and providing a solution that only works in anaconda.
Let me be clear here... we are not forking the notebook. The notebook wrapper is just a short-term wrapper launching a notebook instance with the `jupyter_config_dir` pointing to an `etc/jupyter` living in the prefix (conda environment). Right now, the notebook only looks for nbconfig stuff in the user space (aka `~/.jupyter`)... so we need this wrapper just to read the config from other places... we can completely remove the "nbwrapper" when the notebook is able to read config from other places (prefix or system). So again... we are not forking the notebook and we do not intend to do that...
Users will be frustrated by things that only work on anaconda. While anaconda is great, it is not the only way that people install and use Jupyter.
I agree that right now the work we have done so far was centered around the conda/anaconda ecosystem, but the plan is to release some of these extensions (including nbsetuptools) as pypi packages, so people can eventually use it outside conda as well...
Developers of extensions will be frustrated by having to support our existing approach, whatever you put together, and also the new approach we are building for 5.0. That isn't less pain, but more.
In fact, now speaking as a Jupyter dev, we don't even have "our" approach... we only provide a mechanism to install nbextensions, but we don't have anything for a server-based extension with a JS counterpart (which is a difficult problem to solve because you need to install both Python and JS pieces and enable/register the JS side). We only intend to fill that gap with a solution that helps users on the 4.x series. I don't even intend to extend the current solution into the future unless we can couple npm into the nbsetuptools concept (as you said in your proposal, "npm all the things"... I think that resolves the nbextension - JS only - part, but a server extension with a JS component is a more complex problem).
When we completely remove the 4.x nbextension abstraction, your entire approach will completely stop working. You will be faced with a painful choice of abandoning users/devs or further forking the notebook.
Again, we are not forking the notebook, we are not going in that direction (I would not participate in something like that). We certainly need a kind of transition from the proposal we are working on now to the next architecture, but at least people have something to work with now... if that's not the case, they will hit the wall until they come up with something workable, and in that case they will have to transition too... or even worse, they will have to wait until 5.x... that's something we should avoid, because we lose traction in that case.
I propose: For 5.0, we completely remove the existing way that we load 4.x nbextensions.
OK...
We allow 4.x nbextensions to be packaged and distributed as npm modules. This will be simple as we will already be doing that for the entire 5.0 code base.
Sounds good...
We build a toolchain in Jupyter that is npm based for installing both 4.x extensions and 5.0 JS code.
That's great...
Anaconda can wrap that npm toolchain to provide a single entry point for both 4.x extensions and 5.0 JS.
That could be probably achieved by our nbsetuptools proposal... The idea is to get it right now, so that when 5.0 comes, it helps with this...
We change the existing notebook pages to load 4.x nbextensions from node_modules, but still be able to target the 4.x JavaScript APIs (only on the existing notebook pages though, not the workbench/lab).
I think this is pretty important to give user/devs enough time to transition...
We are already basically pursuing this approach for 5.0. The only new thing I am proposing is that we enable 4.x nbextensions to be managed in the same exact way (npm all the things!).
In general, I almost completely agree with your view... we are just trying to provide a solution for existing devs/users who are currently lost in the middle of a sea of confusion with multiple partial solutions...
Benefits: Users, packagers and developers will have a single approach for managing JS code related to Jupyter 5.0. Extension authors only have to worry about 1 thing - shipping an npm package - regardless of whether they want to target our 4.x or 5.0 JS APIs. In some cases, it may be possible for a single npm package to target both APIs by having code paths for both.
:+1:
It will be much easier to maintain the generator code for both 4.x nbextensions and 5.0 as it will also just be npm based templates.
I agree on this...
Some technical notes: Given how npm works, there will need to be a single node_modules for each notebook installation (env).
Each env could have a kernelspec pointing to the Python executable, so we just need to have `node_modules` in the env where we trigger the notebook server...
The Jupyter Notebook will ship with a default set of packages in that node_modules so the basic notebook can be pip installed without npm/node present.
OK...
The only way to install additional packages into that node_modules will be to use npm. Any additional tools provided by Jupyter or Anaconda for managing these packages will necessarily call out to npm. It would be crazy to try and replace npm for these purposes.
Yep, I agree...
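The point about wrapping rather than replacing npm can be sketched as a thin shell-out. The helper names and the `--save` flag usage here are illustrative assumptions, not an actual Jupyter or Anaconda API; the only load-bearing idea is that higher-level tooling delegates to `npm install`:

```python
import subprocess


def npm_install_args(package):
    """Build the npm invocation a wrapper tool would delegate to."""
    return ["npm", "install", "--save", package]


def install_extension(package, app_dir):
    """Install an extension npm package into the single node_modules
    of a notebook installation (app_dir holds its package.json)."""
    subprocess.check_call(npm_install_args(package), cwd=app_dir)
```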
Just to summarize, and repeating myself: we just intend to ship a simple solution until we evolve the packaging system in 5.x. I think this is pretty important to avoid frustration and help users/devs work with the current release, without having to wait a long time to see how the architecture develops...
@damianavila and @ijstokes thanks, this really helps and we can make some progress.
It definitely helps to understand your usage case a bit better. Let me restate it and see if I have it right:
- You want to improve the situation for the development, installation and activation of 4.x style extensions, running on 4.x releases of jupyter/notebook.
- In this case, 5.0 and beyond is a separate question and you are on board with using whatever npm based approach we come up with.
I fully agree that the situation with the 4.x codebase is too painful (understatement).
Let's try to fix that in 4.x releases in a backwards compatible manner that helps existing and new 4.x nbextensions to "just work" with the latest 4.x release of the notebook. I think we can do that.
Thanks for clarifying your current approach, I understand much better what you are doing. Some comments though:
- While not a true fork (in the version control sense), I still think you are building a "soft fork" from the user's perspective: a user will have to type `anaconda-notebook` rather than `jupyter notebook`, and the app they see will be different.
- If you start the notebook by setting `JUPYTER_CONFIG_DIR` and `JUPYTER_DATA_DIR`, none of the user's other Jupyter stuff will work.
- Existing packages with nbextensions won't get any benefit from this at all. In fact, it is worse. Let's say I want to install `nbgrader` into a conda env and use it with the other extensions you have built. I would do:
pip install nbgrader # this will install into my active conda env
nbgrader extension install # this installs it into standard Jupyter locations
nbgrader extension activate
anaconda-notebook # this disables those standard Jupyter locations
- No nbextensions installed into the standard Jupyter locations will show up and work in the `anaconda-notebook`.
- Because you are inventing a new package format that has weird, nested Python packages, you force existing nbextension authors into a really difficult situation where they won't want to use your new stuff, because it is equally horrible from a developer's perspective, but helps users... but only those using anaconda.
A separate post, looking forward...
What is the minimal set of changes we could make to Jupyter 4.x to ease user/dev pain for developing, packaging and installing 4.x nbextensions, but only with latest 4.x stable releases (not 5.0)?
Can we do this in a backwards compatible way so we can release it quickly in a 4.2 release?
Can we provide hooks/APIs for Jupyter that allow it to better play with conda and other environments?
It would be awesome if the answer to @ellisonbg last question is yes so that a new release could include whatever changes we think are necessary to support these extensions in a way that works well with other extensions.
Can we provide hooks/APIs for Jupyter that allow it to better play with conda and other environments?
We specifically decided against adding hooks to Jupyter to be aware of envs, at least as they pertain to kernels, since doing so would mean that we should support at least virtualenv, conda, rbenv, nodenv, and every language-specific env system there is.
Right now, what we already do is support installation of extensions, specs, and the like in envs, by ensuring that we look at `sys.prefix` as a resource location. There is some ongoing discussion over whether the env should be the default installation location, which would make the default experience better for certain conda use cases (worse for kernels in different envs, better for isolated full stacks with one Python, R, etc.).
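As an illustration of what treating `sys.prefix` as a resource location means, here is a simplified sketch of the search order. The exact directories come from `jupyter_core.paths`; the function below is only an approximation for Unix-like systems (e.g. on macOS the user dir is `~/Library/Jupyter`, as the paths quoted further down show):

```python
import os
import sys


def jupyter_data_search_path():
    """Approximate data-dir search order: user dir first, then the
    active env's sys.prefix, then system-wide locations."""
    home = os.path.expanduser("~")
    return [
        os.path.join(home, ".local", "share", "jupyter"),  # user
        os.path.join(sys.prefix, "share", "jupyter"),      # env (sys.prefix)
        "/usr/local/share/jupyter",                        # system
        "/usr/share/jupyter",
    ]
```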
@ellisonbg @minrk
I think this is critical:
Let me be clear here... we are not forking the notebook. The notebook wrapper is just a short-term wrapper launching a notebook instance with the `jupyter_config_dir` pointing to an `etc/jupyter` living in the prefix (conda environment). Right now, the notebook only looks for nbconfig stuff in the user space (aka `~/.jupyter`)... so we need this wrapper just to read the config from other places... we can completely remove the "nbwrapper" when the notebook is able to read config from other places (prefix or system), so again...
If we solve that, then we can probably eliminate 95%-100% of the issues @ellisonbg raised in his comments.
We briefly discussed this here: https://github.com/jupyter/notebook/issues/331
I think that some way to "intelligently" read nbconfig from user, prefix, and system, plus some way to "integrate" those configs (establishing a hierarchy, figuring out override policies, etc.), would let us ship something that makes 4.x nbextension/serverextension installation and registration easy. That could be used with conda, but also outside the conda/anaconda ecosystem, and make everyone happy :wink:
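A minimal sketch of what that hierarchical read could look like. The function name and the `nbconfig/<section>.json` layout are assumptions for illustration only; the point is that directories are consulted lowest-priority first, so that higher-priority ones (user before prefix before system) override them:

```python
import json
import os


def load_nbconfig(config_dirs, section):
    """Merge nbconfig/<section>.json across several config dirs.

    config_dirs is ordered highest priority first (user, prefix,
    system). Lower-priority values are applied first, so earlier
    directories override them at the top level."""
    merged = {}
    for d in reversed(config_dirs):
        path = os.path.join(d, "nbconfig", section + ".json")
        if os.path.isfile(path):
            with open(path) as f:
                merged.update(json.load(f))
    return merged
```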
Can we provide hooks/APIs for Jupyter that allow it to better play with conda and other environments?
You might like https://github.com/Cadair/jupyter_environment_kernels/ -> all conda envs (after PR7 in that repo is merged) and all virtualenv envs are visible as kernels.
Min, thanks for the clarification. I had missed that we already have `sys.prefix`-based paths.
For clarification, here are the paths that show up on a 4.x conda installed jupyter:
In [12]: jupyter_path('nbextension')
Out[12]:
['/Users/bgranger/Library/Jupyter/nbextension',
'/Users/bgranger/anaconda/envs/python34/share/jupyter/nbextension',
'/usr/local/share/jupyter/nbextension',
'/usr/share/jupyter/nbextension']
In [8]: jupyter_config_path()
Out[8]:
['/Users/bgranger/.jupyter',
'/Users/bgranger/anaconda/envs/python34/etc/jupyter',
'/usr/local/etc/jupyter',
'/etc/jupyter']
The key entries in both of these are `/Users/bgranger/anaconda/envs/python34/etc/jupyter` and `/Users/bgranger/anaconda/envs/python34/share/jupyter/nbextension`.
This issue of "Env aware locations" is one aspect that we need to figure out. But it sounds like that is already solved. @ijstokes @damianavila @bollwyvl can you investigate to see if you can install things there so you don't have to write a wrapper script that changes the env var based paths?
@ellisonbg we are in fact installing into `prefix/share/jupyter` and `prefix/etc/jupyter`... the problem is that the notebook can "detect" nbextensions installed in the jupyter_data_dir, but it cannot read the nbconfig from any jupyter_config_dir other than `~/.jupyter`... check this comment from @minrk: https://github.com/jupyter/notebook/issues/331#issuecomment-140717565
I am planning to work on this issue in the coming days. @minrk @takluyver, input on this will be very welcome... I would like to come up with a well-discussed implementation to support this use case...
@JanSchulz, I guess Brian was referring to some kind of API in the core, but as Min said:
We specifically decided against adding hooks to Jupyter to be aware of envs, at least as they pertain to kernels, since doing so would mean that we should support at least virtualenv, conda, rbenv, nodenv, and every language-specific env system there is.
Btw, I ended up writing a CondaKernelSpecManager which is mainly doing what Stuart did but only for conda envs... but that's another discussion, so I will comment on that project later...
You might like https://github.com/Cadair/jupyter_environment_kernels/ -> all conda envs (after PR7 in that repo is merged) and all virtualenv envs are visible as kernels.
@damianavila I think that's deliberate-ish. As I see it, we support installing nbextensions inside an environment - so you could e.g. make a conda package of livereveal - but they have to be enabled/disabled and configured at the user level. Of course, you could use post-install and pre-remove scripts to automatically enable extensions on install and disable them on uninstall. This also works for e.g. apt packages: an extension could be installed systemwide, and individual users could decide whether to enable it.
It's very much deliberate that frontend config is stored in one place, because:
- I designed the frontend config system primarily as a way to persist config values chosen by the user within the UI. In this context, there's no need to have multiple locations for the same config. Modifying the config externally, as in `jupyter nbextension enable`, is a good thing, but I still want to keep it to one location.
- If the same config can be stored in multiple locations, you have to deal with merging them. This is potentially confusing for users (why is this config value not having any effect? because it's overridden somewhere else), and it's tricky when your values can themselves be collections (do you replace the earlier collection entirely, or try to combine them?). The problem gets worse when your config is machine-written, because then you have to track where each value came from, so you can only write back the relevant ones. It's much simpler if there's only one place where config lives.
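To make the "replace or combine" ambiguity concrete, here is a sketch of one common policy: nested dicts are merged key by key, while any other value (including lists) is replaced wholesale. This mirrors the kind of `recursive_update` helper such config systems tend to use; it is not any specific Jupyter API:

```python
def recursive_update(target, new):
    """Merge `new` into `target` in place: nested dicts are merged
    key by key, while any other value (including lists) replaces
    the old value wholesale."""
    for key, value in new.items():
        if isinstance(value, dict) and isinstance(target.get(key), dict):
            recursive_update(target[key], value)
        else:
            target[key] = value
    return target
```

Under this policy, a list-valued setting from a higher-priority file silently discards (rather than extends) the lower-priority list, which is exactly the kind of surprise described above.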
Ahh, I was getting confused myself about the situation.
The line that causes this problem is here:
https://github.com/jupyter/notebook/blob/master/notebook/services/config/manager.py#L17
Basically, even though the main notebook app knows about all of the different config paths, the main nbconfig manager that serves things to the frontend only uses `jupyter_config_dir()`, which is just the user's `.jupyter` location.