notebook icon indicating copy to clipboard operation
notebook copied to clipboard

Allow Multiple ContentsManager

Open mlucool opened this issue 7 years ago • 5 comments

Can we make it possible to allow for multiple content managers in notebook server?

In a complex environment there may be a more than one type of notebook source so it is better to compose them from the right managers. For example, some notebooks may be from disk, others from a database, while others that purely are read only and exist in memory. Which content manager you call should be driven by the path (/file/full/path/to/file, /db/table/entry).

I was able to write something that works as a POC for this by taking a ContextManager and then dispatching all calls to a list of internally managed ContextMangers. It simply goes in order of those managers and calls an additional function on each, manages_path(path), which returns true if it wants to manage this path. It then passes on the call to this manager and returns its output. This is crude and would be better if there was a fully supported way to make this happen as I imagine there will likely be similar issues with things like the FileCheckpoints etc.

Would this be something this project would be open to taking up?

mlucool avatar Jan 18 '18 00:01 mlucool

@Carreau and @KesterTong already built something like this called MixedContentsManager as part of a project to integrate Jupyter with Google Drive. It hasn't been touched for a while, but maybe there's something useful to work with there.

From the notebook's point of view, I think it's fine to do this entirely within the ContentsManager, which is already pluggable. So I'd suggest building it as a separate thing for now, and then proposing it for inclusion in notebook once it's working nicely.

takluyver avatar Jan 18 '18 10:01 takluyver

Yep there was some plan to make the mixed content manager the default, and just allow to add/remove extra content managers. @ian-r-rose may have also worked on that with his github and google drive plugins for JupyterLab.

Carreau avatar Jan 18 '18 14:01 Carreau

Yes, I have worked on a client-side approach to this in JupyterLab, which dispatches requests to different storage backends, each of which implements the Contents API. This is currently used for the JupyterLab Google Drive and GitHub plugins.

ian-r-rose avatar Jan 18 '18 16:01 ian-r-rose

This code is a WIP but does work:

A ContentsManager that passes to other ContentsManagers who also have a manages_path function:

from notebook.services.contents.filemanager import ContentsManager

def make_dispatched(self, function):
    """
    This function inspects a function and dispatches it out by finding which
    arg is the path, then finding a manager to manage that path and forwarding
    this on
    """
    # functions passed here have self in it, when it's called its bound so -1 for no self
    path_idx = inspect.getargspec(function)[0].index('path') - 1

    def wrapper(*args, **kwargs):
        path = kwargs['path'] if 'path' in kwargs else args[path_idx]
        for manager in self.managers:
            if manager.manages_path(path):
                fn = getattr(manager, function.__name__)
                return fn(*args, **kwargs)
        # This self is jupyter's
        self.log.error("Could not find manager for path %s", path)
        raise Exception('Could not find manager for path "%s"' % path)

    return wrapper


_managers = None
def set_managers(managers):
    """
    We need a global list because I dont see where to pass this data to juptyer->manager
    """
    global _managers
    _managers = managers


class DispatchedContentsManager(ContentsManager):
    def __init__(self, **kwargs):
        global _managers
        self.managers = []
        for manager in _managers:
            # Init each manger with the same params
            inst = manager(**kwargs)
            if not hasattr(inst, 'manages_path'):
                raise RuntimeError('Count not find method manages_path in manager!')
            self.managers.append(inst)

        # FIXME: add support for rename_file! This does not follow the pattern below
        for name, method in inspect.getmembers(self,
                                               predicate=inspect.ismethod):
            if (not name.startswith('_') and
                        'path' in inspect.getargspec(method)[0]):
                setattr(self, name, make_dispatched(self, method))

Example manager (typically manages_path has a regexp in it):

from notebook.services.contents.filemanager import FileContentsManager

class DefaultFileManager(FileContentsManager):
    """
    A fallback manager to be used for all files
    """

    def manages_path(self, path):
        return True

Now in a jupyter config:

set_managers([DefaultFileManager]) # Pass in the class

Any suggestions?

mlucool avatar Jan 18 '18 18:01 mlucool

This seems to be a ready made solution: https://github.com/viaduct-ai/hybridcontents

jessie-ross avatar Mar 12 '24 09:03 jessie-ross