synapse Identify, implement, and document a way to extend our Docker images

Open callahad opened this issue 3 years ago • 5 comments

In the process of splitting the Account Validity feature into its own module we've run up against a roadblock: we don't have a well-defined path for people to add extension modules to our Docker images.

We need to survey prior art, pick a solution, and then implement and document it.

May 06 '21 15:05 callahad

Jenkins suggests building derived images: https://github.com/jenkinsci/docker/blob/jenkins-docker-packaging-2.235.1/README.md#installing-more-tools. They define a location for you to dump files (/usr/share/jenkins/ref/) and also have an install-plugins.sh which can take, effectively, a requirements.txt of extra things to install at container build time.

May 06 '21 15:05 callahad

As alluded to in https://github.com/matrix-org/synapse/issues/6660#issuecomment-738734852 and again mentioned in #synapse-dev, @clokep noted that we could define a mountpoint for a directory outside the container which maps into the container's PYTHONPATH. This way a custom docker image would not need to be maintained in order to install modules.
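A minimal sketch of the mechanism this relies on, assuming nothing about Synapse itself: Python searches every directory listed in PYTHONPATH when importing, so a host directory mounted at such a path becomes importable without rebuilding the image. The temporary directory stands in for the mount point, and `my_module` is a hypothetical module name.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path


def import_via_pythonpath(module_dir: str, module_name: str) -> str:
    """Start a fresh interpreter with module_dir on PYTHONPATH and return
    the file the named module was imported from."""
    env = dict(os.environ, PYTHONPATH=module_dir)
    result = subprocess.run(
        [sys.executable, "-c", f"import {module_name}; print({module_name}.__file__)"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def demo_pythonpath() -> str:
    # The temporary directory plays the role of the mounted host directory.
    with tempfile.TemporaryDirectory() as mount_point:
        Path(mount_point, "my_module.py").write_text("VALUE = 1\n")
        return import_via_pythonpath(mount_point, "my_module")
```

In a container, the equivalent would be a volume mount plus a PYTHONPATH entry pointing at it; which path to use is left open in this thread.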

Feb 28 '22 18:02 anoadragon453

Spitballing after some discussion with @reivilibre some time ago and @DMRobertson just now:

There's an argument to be made about grouping all of the Synapse modules under matrix-org into a monorepo; mostly that it would improve maintainability (in that we can share and update CI more easily, we'd run CI more often on all modules, and we wouldn't need to maintain a list of every module we maintain just so we can remember them). If we do this, it might also solve the issue with using modules with Docker to some extent, in that we could automatically build and publish a Docker image built on top of the Synapse one including all of these modules (and users could then configure Synapse to use any module in that list).

It's nice in that it saves the user from having to (re)build their own Docker image when they want to use our modules (or to update them, or Synapse), but it also has a couple of drawbacks:

- users will always have to use the latest version of each module at the time the tag they're using is created; it wouldn't be possible to use both an old version of module A and the latest version of module B
- users wishing to use modules we don't maintain would still need to build their own custom image

I think it would be appropriate to say that anyone in one of these cases (especially the second one) is probably too custom for us to support and should build their own image, but maybe I'm underestimating how much of a hurdle it is to build and maintain a custom image for a specific deployment.

Some alternatives:

- we document extensively how users can build their own Docker images based off matrixdotorg/synapse with whichever modules they want, add some build args to facilitate things, and let them build it themselves (but that requires them to maintain their own images, which I think we'd like to avoid to some extent as it adds some overhead to updating Synapse)
- we add a --with-module arg to the docker run entry point of the Synapse image and install the modules when starting the container (but this sounds a bit hacky to me, possibly messes with the poetry stuff by bypassing it, and means that we'd need to reinstall the modules at each restart unless we start using volumes for that, which might get a bit iffy)
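To make the second alternative concrete, here is a rough sketch of how an entry point could collect repeated `--with-module` arguments and turn them into a pip invocation. The flag is only the proposal from this comment, not an existing Synapse option, and the function name is hypothetical. The sketch builds the command without running it.

```python
import argparse
import sys
from typing import List, Optional


def build_install_command(argv: List[str]) -> Optional[List[str]]:
    """Return a pip command for the requested modules, or None if none were requested."""
    parser = argparse.ArgumentParser()
    # Hypothetical flag; may be given several times, once per module.
    parser.add_argument(
        "--with-module",
        action="append",
        default=[],
        dest="modules",
        metavar="REQUIREMENT",
    )
    # Ignore any other arguments the entry point might accept.
    args, _rest = parser.parse_known_args(argv)
    if not args.modules:
        return None
    return [sys.executable, "-m", "pip", "install", *args.modules]
```

Because the install would run on every container start, this illustrates the drawback noted above: without a persistent volume, the modules are fetched again at each restart.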

May 04 '22 13:05 babolivier

> There's an argument to be made about grouping all of the Synapse modules under matrix-org into a monorepo

I think this negates one of the main reasons that a module API exists: it allows people other than the Synapse development team to extend Synapse.

May 04 '22 13:05 richvdh

I am neither a Python expert nor do I know exactly how the module loading system works. But wouldn't it be enough to allow an administrator to define an additional module directory in the homeserver.yaml?

The Docker image already has access to the volume folder (/data/...).

Maybe the administrator could place their modules in /data/modules.

The actual loading happens in synapse/util/module_loader.py:

    modulename = provider.get("module")
    if not isinstance(modulename, str):
        raise ConfigError(
            "expected a string", path=itertools.chain(config_path, ("module",))
        )

    # We need to import the module, and then pick the class out of
    # that, so we split based on the last dot.
    module_name, clz = modulename.rsplit(".", 1)
    module = importlib.import_module(module_name)
    provider_class = getattr(module, clz)

    # Load the module config. If None, pass an empty dictionary instead
    module_config = provider.get("config") or {}

The real magic seems to happen in importlib.import_module (https://docs.python.org/3/library/importlib.html#importlib.import_module), which has a second parameter which might be useful.

> The name argument specifies what module to import in absolute or relative terms (e.g. either pkg.mod or ..mod). If the name is specified in relative terms, then the package argument must be set to the name of the package which is to act as the anchor for resolving the package name (e.g. import_module('..mod', 'pkg.subpkg') will import pkg.mod).
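A small sketch of what that second parameter does in practice. It only anchors a relative name such as `..mod` to a package; it does not tell Python which directory to search, so the package still has to be reachable through `sys.path`. The package name `demo_pkg` is made up for the example.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def resolve_relative_import() -> str:
    """Build a throwaway package on disk and import demo_pkg.mod via a relative name."""
    with tempfile.TemporaryDirectory() as root:
        subpkg = Path(root, "demo_pkg", "subpkg")
        subpkg.mkdir(parents=True)
        Path(root, "demo_pkg", "__init__.py").write_text("")
        Path(root, "demo_pkg", "mod.py").write_text("NAME = 'demo_pkg.mod'\n")
        (subpkg / "__init__.py").write_text("")

        # The directory must be on sys.path; the package argument alone is not enough.
        sys.path.insert(0, root)
        try:
            importlib.invalidate_caches()
            # "..mod" relative to demo_pkg.subpkg resolves to demo_pkg.mod.
            module = importlib.import_module("..mod", package="demo_pkg.subpkg")
            return module.__name__
        finally:
            sys.path.remove(root)
```

So for loading modules from an extra directory, the `package` argument on its own is probably not the right tool.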

Oct 04 '22 11:10 gauss-lvs-dev

> Maybe the administrator could place his modules in /data/modules.

Something like this would likely work, but we might need to extend the Python path so that Python could find it (to import it).
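As a rough illustration of extending the Python path, the sketch below appends a directory to `sys.path` and then resolves a `module.Class` string the same way the loader excerpt above does (split on the last dot, import, `getattr`). The `modules_dir` parameter and the `example_module.ExampleModule` name are assumptions for the example, not existing Synapse configuration.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def load_class_from_dir(modules_dir: str, dotted_path: str) -> type:
    """Make modules_dir importable, then load the class named by 'module.Class'."""
    if modules_dir not in sys.path:
        # Appending (rather than prepending) keeps the extra directory from
        # shadowing the standard library or already-installed packages.
        sys.path.append(modules_dir)
    importlib.invalidate_caches()
    module_name, class_name = dotted_path.rsplit(".", 1)
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


def demo_modules_dir() -> str:
    # The temporary directory stands in for something like /data/modules.
    with tempfile.TemporaryDirectory() as modules_dir:
        Path(modules_dir, "example_module.py").write_text(
            "class ExampleModule:\n    pass\n"
        )
        try:
            cls = load_class_from_dir(modules_dir, "example_module.ExampleModule")
            return f"{cls.__module__}.{cls.__name__}"
        finally:
            sys.path.remove(modules_dir)
```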

Aug 28 '23 17:08 clokep
