python-build-standalone
python-build-standalone copied to clipboard
Use simple shebang launcher when the path is not complex
Wouldn't it make more sense to have this complicated launcher only if the Python path contains a space?
Because the Python startup time is already slow enough, I don't think having two extra processes (per startup) (
dirnameandrealpath) would help much. (Also it would break if one intends to use this in a minimal container / environment, where thecoreutilsor equivalent aren't present.)Now I don't exactly know if this launcher should be relocatable, but if so, I think there can be a way to detect in pure POSIX
shif the$0contains a space, and if so do this, else just leaving it alone.
Posted by @cipriancraciun in https://github.com/indygreg/python-build-standalone/pull/395#discussion_r1865382901
Just for reference, my last proposal that uses only POSIX compliant sh would be the following (not tested):
#!/bin/sh
''':'
_path_0="${0}"
## strip one /* suffix (non-eagerly) (equivalent to dirname)
_path="${_path_0%/*}"
if test "${_path_0}" == "${_path}" ; then
## there was no `/` found in `_path_0`
_path='.'
fi
exec -- "${_path}/python3.11d" "${0}" "${@}"
' '''
I'm very sympathetic to the startup time and sh only dependency arguments. I've seen that ''' trick before and it is very effective.
@zanieb is the state of the main branch with the initial shebang change safe to release? Or should we revert to get back to a known good state?
I think it's worth releasing the initial change regardless, it's a bug fix which I think takes precedence over the performance optimization here especially without benchmarks.
[...] especially without benchmarks
Here are some figures:
> hyperfine --runs 2000 --warmup 100 --shell none ./direct.py ./sh-strings.py ./sh-dirname.py
Benchmark 1: ./direct.py
Time (mean ± σ): 15.0 ms ± 2.3 ms [User: 12.7 ms, System: 2.2 ms]
Range (min … max): 11.7 ms … 19.9 ms 2000 runs
Benchmark 2: ./sh-strings.py
Time (mean ± σ): 15.6 ms ± 2.2 ms [User: 12.9 ms, System: 2.6 ms]
Range (min … max): 12.4 ms … 20.8 ms 2000 runs
Benchmark 3: ./sh-dirname.py
Time (mean ± σ): 18.1 ms ± 2.2 ms [User: 14.9 ms, System: 3.2 ms]
Range (min … max): 14.6 ms … 22.9 ms 2000 runs
Summary
./direct.py ran
1.04 ± 0.22 times faster than ./sh-strings.py
1.21 ± 0.24 times faster than ./sh-dirname.py
Where:
- in all cases the actual Python script is empty;
directuses plain#!/.../pythonsh-stringsuses the proposed variant above that doesn't rely on sub-processes;sh-dirnameuses the initial variant that relies on two subprocessesdirnameandrealpath;
Conclusions:
- using just the
shlauncher with strings adds about 4% of overhead as compared to no launcher; - using the two subprocesses adds about another 20% of overhead as compared to the plain launcher;
For completeness, here are the files:
-- direct.py
#!./python
-- sh-strings.py
#!/bin/sh
''':'
_path_0="${0}"
_path="${_path_0%/*}"
if test "${_path_0}" == "${_path}" ; then
_path='.'
fi
exec -- "${_path}/python" "${0}" "${@}"
' '''
-- sh-dirname.py
#!/bin/sh
'''exec' "$(dirname -- "$(realpath -- "$0")")/python" "$0" "$@"
' '''
I was more concerned with the portability concerns that were raised. But I guess we relied on dirname before and introduced realpath. So the dependency on non shell builtins was already there. This feels safe to ship.
The realpath dependency may be a real problem on old macOS versions (e.g., https://github.com/astral-sh/uv/issues/9532#issuecomment-2514962865) but yeah I don't think that's a critical concern.
Thanks for doing some benches @cipriancraciun. I feel like 2ms is very small when we're talking about invoking a Python application, but I agree it's worth changing regardless — it doesn't feel too costly to maintain. I'm not sure of other possible downsides?
I'm not sure of other possible downsides?
There is a potential issue: if the my-tool.py is symlinked to somewhere outside the Python bin, (i.e. /home/user/.bin/my-tool), then the current sh-only would fail to detect this, and would incorrectly use /home/user/.bin/python /home/user/.bin/my-tool ....
This could be patched by checking if ${0} is itself a symlink (i.e. test -h "${0}"), in which case one must use realpath or readlink (I don't know which is POSIX) to initialize _path. (The case when one of the folders on the path is a symlink shouldn't matter.)
However, I hope others think at other conrer-cases before deploying it.
Another possible corner-case, although less-likely, is when the Python executable is installed directly under /, thus the tool is invoked as /my-tool (for example in containers?). In this case, another case should be added that checks if the stripped path is the empty string, in which case it should be replaced with /.
The new prologue should thus be (not tested):
#!/bin/sh
''':'
_path_0="${0}"
if test -h "${_path_0}" ; then
_path_0="$( realpath -- "${_path_0}" )"
fi
_path="${_path_0%/*}"
if test -z "${_path}" ; then
_path='/'
elif test "${_path_0}" == "${_path}" ; then
_path='.'
fi
exec -- "${_path}/python" "${0}" "${@}"
' '''
It remains to be debated if Python is invoked with ${0} (i.e. the original path which might be a symlink) or ${_path_0}. (If one uses ${0} and the script imports files relative to itself, I think it would fail to find the Python files if it is a symlink.)
Python's startup time is horrible and it inflicts real pain on the ecosystem. However in this case we're talking about supplementary tools like pip, not python. pip is already super slow: see uv.
I want to say the startup time issue is not a consideration here because all the tools we're talking about are rarely executed or startup time is a trivial part of their total time.