non-hacky, wheel-compatible way to implement a post-install hook

Open glyph opened this issue 9 years ago • 120 comments

Sometimes you have to generate data dependent upon the installation environment where a package is being installed. For example, Twisted's plugin system has to make an index of all the files which have been installed into a given namespace package for quick lookup. There doesn't seem to be a way to do this properly in a wheel, since setup.py is not run upon installation.
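
For context, the closest thing to a post-install hook today is a setup.py hack along the following lines. This is only a sketch: rebuild_plugin_index is an illustrative helper standing in for the real index regeneration, and the whole thing only runs for sdist installs, because installing a wheel never executes setup.py.

import os

from setuptools import setup
from setuptools.command.install import install


def rebuild_plugin_index(install_lib):
    # Illustrative stand-in for regenerating a plugin index: list whatever
    # was installed into a plugin directory and write a cache file next to it.
    plugin_dir = os.path.join(install_lib, "example_plugins")
    if os.path.isdir(plugin_dir):
        index_path = os.path.join(plugin_dir, "index.cache")
        with open(index_path, "w") as f:
            f.write("\n".join(sorted(os.listdir(plugin_dir))))


class PostInstall(install):
    def run(self):
        install.run(self)
        rebuild_plugin_index(self.install_lib)


setup(
    name="example-with-post-install",
    cmdclass={"install": PostInstall},
)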

glyph avatar Jun 15 '15 08:06 glyph

Imho namespace packages were a terrible idea, at least pre-Python 3.3's implicit namespace packages, and I'm still not sold on those either.

Why does Twisted feel the need to build up that plugin file index? Is the normal import procedure really that slow that this index provides a significant speedup?

daenney avatar Jun 15 '15 19:06 daenney

A post install hook is probably a reasonable request.

dstufft avatar Jun 15 '15 19:06 dstufft

The plugin file index is frequently used by command-line tools where start-up time is at a premium. It really varies a lot - on a hot cache, on an SSD, it's not noticeable, but on a loaded machine with spinning rust it can be quite pronounced.

In any case, the Twisted plugin system is just an example. Whether it is a good idea or not (and there's certainly room for debate there) it already exists, and it's very hard to make it behave correctly without user intervention in the universe of wheels. I was actually prompted to post this by a requirement from pywin32, not Twisted, but pywin32's use-case is a bit harder to explain and I don't understand it as well.

glyph avatar Jun 15 '15 21:06 glyph

The main concern I have with post-install hooks is the arbitrary code execution at installation time.

I'm a lot more comfortable with installer plugins that process additional metadata from the wheel's dist-info directory, as that reduces the scope of code to be reviewed that is likely to be run with elevated privileges.

In the absence of metadata 2.0, one possible scheme might be to allow metadata file handlers to be registered by way of entry points.
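
As a rough sketch of that idea (the entry point group name and handler module below are purely hypothetical, not an existing installer API), the project providing the handler might register it like this:

from setuptools import setup

setup(
    name="twisted-plugin-support",
    py_modules=["twisted_plugin_support"],
    entry_points={
        # Hypothetical group: installers would look up handlers here, keyed
        # by the dist-info metadata filename they know how to process.
        "installer.metadata_handlers": [
            "twisted_plugins.json = twisted_plugin_support:handle_metadata",
        ],
    },
)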

ncoghlan avatar Jun 15 '15 23:06 ncoghlan

As far as pywin32 goes, my recollection is that it has tools for registering Windows COM objects and other services at install time.

ncoghlan avatar Jun 15 '15 23:06 ncoghlan

The pywin32 case reminded me of another reason why I prefer metadata handling plugins: you only have to get uninstallation support right in one place, rather than in every project that needs a particular hook.

ncoghlan avatar Jun 15 '15 23:06 ncoghlan

I doubt @glyph cares if the post-install hook is done via a metadata plugin or via some script in the distribution (though I totally agree it should be via a plugin).

dstufft avatar Jun 15 '15 23:06 dstufft

I am not sure I'm clear on the difference. In order for this to work reasonably, it has to be something that can come from the distribution, not in pip or in a plugin installed into pip; how exactly that thing is registered certainly doesn't matter to me.

glyph avatar Jun 16 '15 02:06 glyph

@ncoghlan - ironically enough, it's easier to execute arbitrary code at installation time when one is definitely installing with elevated privileges; i.e. in a deb, an RPM, or a Windows installer. The reason that I want this functionality is that (especially with pip now doing wheels all the time) it is no longer possible to do this in a way which happens sensibly when you are running without elevated privileges, i.e. just pip installing some stuff into a virtualenv.

Registering COM plugins is another great example of a thing one might need to do at install time which can't really be put off; I wouldn't know if you could sensibly run COM plugins in a virtualenv though, that's @mhammond's area of expertise, not mine.

glyph avatar Jun 16 '15 02:06 glyph

pywin32's post-install script does things like creating shortcuts on the start menu, writing registry entries and registering COM objects - but it does adapt such that it can fall back to doing these things for just the current user if it doesn't have full elevation. It doesn't currently adapt to not having this post-install script run at all.

But as @glyph and I have been discussing, many users do not want or need some of this - if a user just wants to "casually" use pywin32 (i.e., so that importing win32api works), many would be happy. I think it's a fine compromise for pywin32 to have multiple ways of being installed - use the full-blown executable installer if you want the bells and whistles, but also support being installed via pip/wheel etc., where you get nothing fancy. But even in this environment there are some files that should be copied (from inside the package to the site directory) for things to work as expected.

mhammond avatar Jun 16 '15 02:06 mhammond

Would it make sense for pywin32 to do things like start-menu and COM registration as extras, or separate packages? It seems, for example, that pythonwin could just be distributed separately, since I don't need an IDE to want to access basic win32 APIs ;)

glyph avatar Jun 16 '15 03:06 glyph

I only bring that up by way of pointing out that even if everything were nicely separated out, those extras would still need their own post-install hooks, and each one would be doing something quite different.

glyph avatar Jun 16 '15 03:06 glyph

I don't think that someone seeking out the pywin32 installer is well served by splitting it into multiple packages to download and run. Conversely, I don't think people looking for a pip/wheel installation of pywin32 are going to be interested in the COM integration - they probably only want it because some package they care about depends on it. So IMO, a "full blown pywin32 installer" and an "automatically/scriptable minimal installation" will keep the vast majority of people happy.

mhammond avatar Jun 16 '15 03:06 mhammond

I definitely want to be able to put COM components into isolated environments, though, which is why I strongly feel that whatever the solution is for virtualenv has to be an instantiation of whatever is happening system-wide.

glyph avatar Jun 16 '15 03:06 glyph

Regarding the installer plugin based approach discussed in http://legacy.python.org/dev/peps/pep-0426/#support-for-metadata-hooks, the way I would expect a system like that to work is:

  • either twisted itself, or a suitably named separate component (e.g. "twisted-plugin") provides the necessary metadata processing utility that handles installation and uninstallation of Twisted plugins
  • any package that needs that metadata to be processed declares a normal dependency on the project that provides the installer plugin

From an end-user perspective, the default behaviour would be that the plugin gets downloaded and invoked automatically by the subsequent package installation. No extra steps, at least when running with normal user level privileges (as noted in https://bitbucket.org/pypa/pypi-metadata-formats/src/default/metadata-hooks.rst there's a reasonable case to be made that installing and running metadata hooks should be opt-in when running with elevated privileges).

While those links refer to the PEP 426 metadata format, I've come to the view that a better model would likely be to put extensions in separate files in the dist-info directory, and have metadata handlers be registered based on those filenames. Not coincidentally, that has the virtue of being an approach that could be pursued independently of progress on metadata 2.0.

The key difference between this approach and allowing arbitrary post-install and pre-uninstall scripts lies in the maintainability. Instead of the Twisted project saying "here is the code to add to your post-install and pre-uninstall scripts to procedurally register and unregister your plugins with Twisted", they'd instead say "declare your Twisted plugins in this format, and depend on this project to indicate you're providing a Twisted plugin that needs to be registered and unregistered". If the registration/unregistration process changes, the Twisted project can just update the centrally maintained metadata processing plugin, rather than having to try to propagate a code update across the entire Twisted plugin ecosystem. (I admit such a change is unlikely in the case of Twisted specifically, but it gives the general idea).
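
To make the division of labour concrete (everything below is hypothetical, including the file name and format): a plugin-providing package would ship a small declarative file in its dist-info directory, and only the centrally maintained handler would contain imperative code.

import json


def handle_twisted_plugins(metadata_path):
    # Hypothetical handler living in the central "twisted-plugin" project.
    # The installer calls it with the path to the declarative file it found
    # in the newly installed distribution's dist-info directory, e.g.
    # {"plugins": ["myproject.plugins.myplugin"]}.
    with open(metadata_path) as f:
        declared = json.load(f)
    for dotted_name in declared.get("plugins", []):
        # All registration logic stays here, so a change to the plugin cache
        # format only requires updating this one project.
        print("registering", dotted_name)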

Likewise for pywin32 and updating it for changes to the way it interacts with the underlying OS - I strongly believe it's better to centralise the definition of that operating system interaction code in pywin32 itself, and use a declarative metadata based approach in the projects relying on it.

There's an interesting question around "Do post-install and pre-uninstall hooks declared by a package also get run for that particular package?", and my answer to that is "I'm not sure, as I can see pros and cons to both answers, so a decision either way would need to be use case driven".

(From my position working on Red Hat's hardware integration testing system, I got an interesting perspective on the way systemd ended up taking over the world of Linux init systems. My view is that one of the biggest reasons for its success is that it replaced cargo-culted imperative shell scripts for service configuration, which resulted in highly variable quality across service implementations, with declarative configuration files that ensure every single Linux service managed via systemd provides a certain minimum level of functionality, and that new features, like coping with containerisation, can be implemented just by updating systemd, rather than hoping for every single sysvinit script in the world to get modified appropriately.)

ncoghlan avatar Jun 16 '15 08:06 ncoghlan

I do see a need for this with the use of wheel convert to transform a Windows installer file into a wheel package. The installer would previously have run these scripts.

I'm converting a few because I prefer the wheel way of doing things, and nearly 90% of my work is on Windows.

The packages that I have seen are py2exe, pywin32, pyside, spyder to name ones which have a post-install script that is installed by the wheel package delivery method.
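
(For anyone following along, the conversion step itself is just the wheel project's convert subcommand; the installer filename below is only an example:)

wheel convert pywin32-219.win-amd64-py3.4.exe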

Or should pip be extended to run these post-install scripts, based on metadata in the wheel archive? The files are already there. Pip is technically the thing doing the install, and the wheel is just the package format.

guyverthree avatar Jun 26 '15 15:06 guyverthree

this would also allow https://pypi.python.org/pypi/django-unchained to ship a wheel

graingert avatar Jan 19 '17 16:01 graingert

I'd like this to be able to do things like register metadata, and install language files for Gtk.

For shoebot, we are making a library to help text editors plug us in. Those text editors need that extracted to some directory (that bit could be done from a WHL), but then some supplementary things need to happen, e.g. for gedit: install a .plugin file and register a .lang file.

The destination of these files depends on the platform. I'd be happy, on installation, to tell the installer where they went so it could uninstall them if needed.

stuaxo avatar May 09 '18 11:05 stuaxo

I'm also interested in this. The use case I have in mind is installing git hooks for development environments, the way husky does via a devinstall hook. This would require pip to distinguish "development" installs, which it doesn't support AFAIK. So I'm not sure how this would look in practice.

sloria avatar Jun 20 '18 16:06 sloria

I need either something similar to this, or the ability to install files relative to the Python executable.

My use case is that I am trying to package the module BPY from Blender as a Python module. Unfortunately, there is a folder, named after the current version of Blender (2.79 at the moment), that needs to be a sibling of the Python executable.

Unfortunately (on Windows, and in my experience at least), that location can vary based on what we have available in setuptools anyway. Currently I can describe this folder containing .py files as being 'scripts' per the setuptools documentation, and that works for most cases where the user is installing into a freshly created venv.

py -3.6-64 -m venv venv
venv\Scripts\activate
(venv) py -m pip install --upgrade pip
(venv) py -m pip install bpy

  • Path to python.exe in venv: ./venv/Scripts/python.exe
  • Path to 2.79 folder: ./venv/Scripts/2.79
  • 2.79 folder is sibling to python.exe: True

But consider the case where someone is installing this into either their system install on Windows, or a Conda env, such as is the case here: https://github.com/TylerGubala/blenderpy/issues/13

py -3.6-64 -m pip install --upgrade pip
py -m pip install bpy

  • Path to python.exe: {PYTHONPATH}/python.exe
  • Path to 2.79 folder: {PYTHONPATH}/Scripts/2.79
  • 2.79 folder is sibling to python.exe: False

The result of not having the 2.79 folder in the correct location (sibling to the executable) is that when you import the bpy module you will receive the following error due to it not being able to find the required libraries:

import bpy

AL lib: (EE) UpdateDeviceParams: Failed to set 44100hz, got 48000hz instead
ModuleNotFoundError: No module named 'bpy_types'
ModuleNotFoundError: No module named 'bpy_types'
ModuleNotFoundError: No module named 'bpy_types'
ModuleNotFoundError: No module named 'bpy_types'
ModuleNotFoundError: No module named 'bpy_types'
F0829 15:50:51.174837 3892 utilities.cc:322] Check failed: !IsGoogleLoggingInitialized() You called InitGoogleLogging() twice!
*** Check failure stack trace: ***
ERROR (bpy.rna): c:\users\tgubs.blenderpy\blender\source\blender\python\intern\bpy_rna.c:6662 pyrna_srna_ExternalType: failed to find 'bpy_types' module
ERROR (bpy.rna): c:\users\tgubs.blenderpy\blender\source\blender\python\intern\bpy_rna.c:6662 pyrna_srna_ExternalType: failed to find 'bpy_types' module
ERROR (bpy.rna): c:\users\tgubs.blenderpy\blender\source\blender\python\intern\bpy_rna.c:6662 pyrna_srna_ExternalType: failed to find 'bpy_types' module

Obviously, I can automate this via a script: simply find the 2.79 folder where it is expected and move it to be a sibling of python.exe if it's not there already. However, this brings the installation up to two commands: one to install the bpy module from PyPI and another to make the install correct. That's clumsy, a minor annoyance, and probably prone to people accidentally not running the second command.

One might suggest that bpy be made simply an sdist-only distribution, and handle the 2.79 folder specifically. However, the sheer size of the source code and precompiled libraries (especially on Windows: > 6GB to download!!!) makes this somewhat of a non-starter. Currently, the only thing that I think I might be able to do is to make bpy's install_requires reference a package on PyPI whose sole purpose is to subclass setuptools.command.install.install to perform the 2.79 movement for the user. However, this seems like a band-aid and not too great a solution.
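
To show what that band-aid looks like (sketch only; the helper package name and paths are illustrative, and it only works when the helper is installed from an sdist, since installing a wheel never runs setup.py):

# setup.py of a hypothetical "bpy-post-install" helper package
import os
import shutil
import sys

from setuptools import setup
from setuptools.command.install import install


class MoveVersionFolder(install):
    def run(self):
        install.run(self)
        # If the 2.79 folder ended up under Scripts/ (the "scripts" location
        # used by the bpy wheel), move it up next to python.exe instead.
        exe_dir = os.path.dirname(sys.executable)
        current = os.path.join(exe_dir, "Scripts", "2.79")
        wanted = os.path.join(exe_dir, "2.79")
        if os.path.isdir(current) and not os.path.isdir(wanted):
            shutil.move(current, wanted)


setup(
    name="bpy-post-install",
    cmdclass={"install": MoveVersionFolder},
)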

It would be awesome for contributors to have a sensible way of doing this as well: a declarative way where everyone could expect post-install scripts to be placed such that setuptools and pip know about them, irrespective of whether it is an sdist or bdist_wheel installation.

I hope that all makes sense. Unfortunately, for Blender specifically, I don't have control over the source code. There are many other issues and considerations that drive the motivation for having the 2.79 folder relative to the python executable that I won't go into here.

Suffice it to say that a simple post-install script, OR being able to specify that the 2.79 folder must be placed relative to the executable, would have resolved this issue in a snap.

Hopefully that all makes sense!

TylerGubala avatar Sep 01 '18 19:09 TylerGubala

Maybe you could make that folder on first import?

graingert avatar Sep 01 '18 19:09 graingert

Is there a smart way to do that?

The module in question is a .pyd, so it's compiled from C code that I don't have control over.

I'd love to hear more about your suggestion!

TylerGubala avatar Sep 01 '18 19:09 TylerGubala

You can make a bpy package, move the current module to bpy/_speedups.pyd, and then in bpy/__init__.py do all the setup and re-export bpy._speedups.
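
Roughly something like this (sketch only; the setup logic is a placeholder for whatever environment preparation bpy actually needs):

# bpy/__init__.py
import os
import sys


def _prepare_environment():
    # Placeholder for first-import setup, e.g. making sure the 2.79 folder
    # the compiled extension expects is present next to the interpreter.
    exe_dir = os.path.dirname(sys.executable)
    version_dir = os.path.join(exe_dir, "2.79")
    if not os.path.isdir(version_dir):
        os.makedirs(version_dir)


_prepare_environment()

# Re-export the renamed compiled extension.
from bpy._speedups import *  # noqa: F401,F403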

graingert avatar Sep 01 '18 19:09 graingert

Maybe you could make that folder on first import?

In the general case, this is definitely not the right answer. If you're trying to package things up for distribution, the location relative to the python executable definitely shouldn't be writable all the time, and certainly shouldn't be writable by the user using the application just because it's writable by the user who installed it.

glyph avatar Sep 01 '18 20:09 glyph

I hope that all makes sense. Unfortunately, for Blender specifically, I don't have control over the source code. There are many other issues and considerations that drive the motivation for having the 2.79 folder relative to the python executable that I won't go into here.

I want to push back on this. Obviously I don't know what all the considerations here are, so I may well be missing something, but at first glance this seems like an unlikely and unreasonable requirement.

@glyph is right that the user may not have write access to the folder where the python package is, but ... The installer doesn't necessarily have those permissions either!

Is this just because you need the directory to be on the dll search path? There are lots of ways to manage that that don't require creating a non-standard python environment. Or what are the issues here?

njsmith avatar Sep 01 '18 20:09 njsmith

It's not the .dll search path; it's that the Blender C module code depends on Python packages that exist as normal .py files. These files must exist in a folder that matches the Blender version (2.79 at the time of writing); otherwise the .pyd module simply won't work, as it cannot find the .py files it depends upon, which are supposed to exist at {PYTHON_EXE_DIR}/2.79/....

The mechanism for finding said .py files is part of the C code that I do not have control over.

Is the worry here security?

TylerGubala avatar Sep 01 '18 21:09 TylerGubala

@TylerGubala I understand you're in a difficult position, where you don't control either how Blender works or how Python packaging works, and are just trying to figure out some way to make things work. But... it sounds like you're saying that the Blender devs made some decisions about how to arrange their Python environment that are incompatible with how Python normally works, so now Python needs to change to match Blender. Maybe it would be better to like, talk to the Blender devs and figure out some way they could make their code work in a regular Python environment?

njsmith avatar Sep 01 '18 23:09 njsmith

@njsmith Understandable. I'm not in any position to say why or how Python needs to change, nor is that my intent. I just wanted to outline my use case for consideration.

I think I may include a script that prospective users will run themselves after installing. If they don't have the permissions to move the files I'll delegate that to Windows to handle.

Maybe it would be better to like, talk to the Blender devs and figure out some way they could make their code work in a regular Python environment?

There is a rumor that Blender 3.0 will be a Python module already, in which case it will work and play nice out of the box. That's a ways off, but it is in their milestones, I guess.

Until then I'm just, like you said, attempting to make it work as a pet project.

Thanks for your insight!

TylerGubala avatar Sep 01 '18 23:09 TylerGubala

Related: a hook for when a package installation fails would be great, too. Many times there are known failures, and a post-fail hook could give users a hint as to why their installation failed and how to resolve it.

E.g.:

$ pip install my_package
Building wheels for collected packages: my_package
  Building wheel for my_package (setup.py) ... error
  [....]
  [Error]: gcc: command not found
-------------- Package Message: Install failed -------------- 
Some of the dependencies of this package build foreign code from source.
Therefore, setup depends on certain software to be available on your system,
e.g. a C compiler.

On debian systems run:
    sudo apt-get install build-essential

con-f-use avatar Mar 04 '19 15:03 con-f-use

The question here is about adding support for a hook in the wheel spec. There's no possibility of a package-specific failure when installing a wheel, as all that installing a wheel involves is unpacking a zip file. Your example is of a failure while building a wheel, which is a different issue (and in a PEP 517 world, one that's likely to be backend-specific).

pfmoore avatar Mar 04 '19 16:03 pfmoore