davos
Import packages in Python, even if they aren't installed!
Someone once told me that the night is dark and full of terrors. And tonight I am no knight. Tonight I am Davos the smuggler again. Would that you were an onion.
Introduction
The davos library provides Python with an additional keyword: smuggle.
The smuggle statement works just like the built-in import statement, with two major differences:
- You can smuggle a package without installing it first
- You can smuggle a specific version of a package
Taken together, these two enhancements to import provide a powerful system for developing and sharing reproducible code that works across different users and environments.
Table of contents
- Table of contents
- Introduction
  - Why would I want an alternative to import?
  - Why not use virtual environments, containers, and/or virtual machines instead?
  - Quick start
- Installation
  - Latest Stable PyPI Release
  - Latest GitHub Update
  - Installing in Colaboratory
- Overview
  - Smuggling Missing Packages
  - Smuggling Specific Package Versions
  - Use Cases
    - Simplify sharing reproducible code & Python environments
    - Guarantee your code always uses the latest version, release, or revision
    - Compare behavior across package versions
- User guide
  - The smuggle Statement
    - Syntax
    - Rules
  - The Onion Comment
    - Syntax
    - Rules
  - The davos Config
    - Reference
    - Top-level Functions
- How It Works: The davos Parser
- Additional Notes
Why would I want an alternative to import?
In many cases, smuggle and import do the same thing—if you're
running code in the same environment you developed it in. But what if you want
to share a Jupyter notebook containing your code with
someone else? If the user (i.e., the "someone else" in this example) doesn't
have all of the packages your notebook imports, Python will raise an exception
and the code won't run. It's not a huge deal, of course, but it's inconvenient
(e.g., the user might need to pip-install the missing packages, restart their
kernel, re-run the code up to the point it crashed, etc.—possibly going
through this cycle multiple times until the thing finally runs).
A second (and more subtle) issue arises when the developer (i.e., the person
who wrote the code) used or assumed different versions of the imported
packages than what the user has installed in their environment. So maybe the
original author was developing and testing their code using pandas 1.3.5, but
the user hasn't upgraded their pandas installation since 0.25.0. Python will
happily "import pandas" in both cases, but any changes across those versions
might change what the developer's code actually does in the user's (different)
environment—or cause it to fail altogether.
The problem davos tries to solve is similar to the idea motivating virtual
environments, containers, and virtual machines: we want a way of replicating
the original developer's environment on the user's machine, to a sufficiently
good approximation that we can be "reasonably confident" that the code will
continue to behave as expected.
When you smuggle packages instead of importing them, it guarantees (for
whatever environment the code is running in) that the packages are importable,
even if they hadn't been installed previously. Under the hood, davos figures
out whether the package is available, and if not, it uses pip to download and
install anything that's missing (including missing dependencies). From that
point, after having automatically handled those sorts of dependency issues,
smuggle behaves just like import.
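The check-then-install step described above can be sketched with the standard library alone. This is a minimal illustration, not davos's actual implementation: the helper name `pip_command_if_missing` and its `requirement` parameter are hypothetical, and the sketch only builds the pip command rather than running it.

```python
import importlib.util
import sys


def pip_command_if_missing(module_name, requirement=None):
    """Return a pip command that would install the module, or None if it is importable."""
    # find_spec() returns None when a top-level module cannot be located
    if importlib.util.find_spec(module_name) is not None:
        return None
    # Run pip through the current interpreter so the matching environment is targeted
    return [sys.executable, "-m", "pip", "install", requirement or module_name]


# A standard library module is always importable, so nothing needs installing
print(pip_command_if_missing("json"))  # prints None

# A missing module yields the command that would be handed to a subprocess
print(pip_command_if_missing("no_such_module_xyz", "some-distribution"))
```

The separate `requirement` argument reflects the fact that the name used to install a package can differ from the name used to import it.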
The second powerful feature of davos comes from another construct, called
"onion comments." These are like standard Python
comments, but they appear on the same line(s) as smuggle statements, and they
are formatted in a particular way. Onion comments provide a way of precisely
controlling how, when, and where packages are installed, how (or if) the system
checks for existing installations, and so on. A key feature is the ability to
specify exactly which version(s) of each package are imported into the current
workspace. When used in this way, davos enables authors to guarantee that the
same versions of the packages they developed their code with will also be
imported into the user's workspace at the appropriate times.
Why not use virtual environments, containers, and/or virtual machines instead?
Psst-- we'll let you in on a little secret: importing davos automatically
creates a virtual environment for your notebook. However, whereas setting up a
virtual environment is usually left to the user, davos handles the pesky
details for you, without you needing to think about them. Any packages you
smuggle via davos that aren't available in the notebook's original runtime
environment are installed into a new virtual environment. This ensures that
davos will not change the runtime environment (e.g., by installing new
packages, changing existing package versions, etc.).
By default, each notebook's virtual environment is stored in a hidden ".davos"
folder inside the current user's home directory. The default environment name
is computed to uniquely identify each notebook, according to its filename and
path. However, a notebook's virtual environment may be customized by setting
davos.project to any string that can be used as a valid folder name in the
user's operating system. This is useful for multi-notebook projects that share
dependencies (without needing to duplicate each package installation for each
notebook).
If you prefer, you can also disable davos's virtual environment
infrastructure by setting davos.project to None. Doing so will cause any
packages installed by davos to affect the notebook's runtime environment.
This is generally not recommended, as it can lead to unintended consequences
for other code that shares the runtime environment. That said, davos also
works great when used inside of (standard) virtual environments, containers,
and virtual machines.
There are a few additional specific advantages to davos that go beyond more
typical virtual environments, containers, and/or virtual machines. The main
advantage is that davos is very lightweight: importing davos into a
notebook-based environment unlocks all of its functionality without needing to
install, set up, and learn how to use additional stuff. There is none of the
typical overhead of setting up a new virtual environment (or container, virtual
machine, etc.), installing third-party tools, writing and sharing configuration
files, and so on. All of your code and its dependencies may be contained in a
single notebook file.
Okay... so how do I use this thing?
To turn a standard Jupyter (IPython) notebook, including a Google Colaboratory notebook, into a davos-enhanced notebook, just add two lines to the first cell:
%pip install davos
import davos
This will enable the smuggle keyword in your notebook environment. Then you can do things like:
# pip-install numpy v1.23.1, if needed
smuggle numpy as np # pip: numpy==1.23.1
# the smuggled package is fully imported and usable
arr = np.arange(15).reshape(3, 5)
# and the onion comment guarantees the desired version!
assert np.__version__ == '1.23.1'
Interested? Curious? Intrigued? Check out the table of contents for more details! You may also want to check out our paper for more formal descriptions and explanations.
Installation
Latest Stable PyPI Release
pip install davos
Latest GitHub Update
pip install git+https://github.com/ContextLab/davos.git
Installing in Colaboratory
To install davos in Google Colab, add a new cell to the top of your notebook with a
percent sign (%) followed by one of the commands above (e.g., %pip install davos). You'll likely also want to import davos,
which enables the smuggle syntax. Run the cell to install davos on the runtime virtual machine.
Note: restarting the Colab runtime does not affect installed packages. However, if the runtime is "factory reset"
or disconnected due to reaching its idle timeout limit, you'll need to rerun the cell to reinstall davos on the fresh
VM instance.
Overview
The primary way to use davos is via the smuggle statement, which is made available
simply by running import davos. Like
the built-in import statement, the smuggle statement is used to
load packages, modules, and other objects into the current namespace. The main difference between the two is in how
they handle missing packages and specific package versions.
Smuggling Missing Packages
import requires that packages be installed before the start of the interpreter session. Trying to import a package
that can't be found locally will throw a
ModuleNotFoundError, and you'll have to
install the package from the command line, restart the Python interpreter to make the new package importable, and rerun
your code in full in order to use it.
The smuggle statement, however, can handle missing packages on the fly. If you smuggle a package that isn't
installed locally, davos will install it for you, make its contents available to Python's
import machinery, and load it into the namespace for immediate use.
You can control how davos installs missing packages by adding a special type of inline comment called an
"onion" comment next to a smuggle statement.
Smuggling Specific Package Versions
One simple but powerful use for onion comments is making smuggle statements version-sensitive.
Python doesn't provide a native, viable way to ensure a third-party package imported at runtime matches a specific
version or satisfies a particular version constraint.
Many packages expose their version info via a top-level __version__ attribute (see
PEP 396), and certain tools (such as the standard library's
importlib.metadata and
setuptools's
pkg_resources) attempt to parse version info from
installed distributions. However, using these to constrain imported packages would require writing extra code to compare
version strings and still manually installing the desired version and restarting the interpreter any time an
invalid version is caught.
Additionally, for packages installed through a version control system (e.g., git), this would be insensitive to differences between revisions (e.g., commits) within the same semantic version.
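For comparison, here is roughly what the manual approach looks like with `importlib.metadata`. It is a sketch of the extra code the paragraph above refers to, and the helper names `installed_version` and `matches_pin` are made up for illustration. Note that it can only report a mismatch; installing the right version and restarting the interpreter would still be up to you.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


def matches_pin(dist_name, pinned):
    """Check whether the installed distribution matches an exact version pin."""
    # A plain string comparison only works for exact pins, not for ranges
    return installed_version(dist_name) == pinned


# A distribution that is not installed never satisfies a pin
print(matches_pin("definitely-not-installed-dist-xyz", "1.0"))  # prints False
```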
davos solves these issues by allowing you to specify a specific version or set of acceptable versions for each
smuggled package. To do this, simply provide a
version specifier in an
onion comment next to the smuggle statement:
smuggle numpy as np # pip: numpy==1.23.1
from pandas smuggle DataFrame # pip: pandas>=1.0,<2.0
In this example, the first line will load numpy into the local namespace under the alias "np",
just as "import numpy as np" would. First, davos will check whether numpy is installed locally, and if so, whether
the installed version exactly matches 1.23.1. If numpy is not installed, or the installed version is anything
other than 1.23.1, davos will use the specified installer program, pip, to
install numpy==1.23.1 before loading the package.
Similarly, the second line will load the "DataFrame" object from the pandas library,
analogously to "from pandas import DataFrame". A local pandas version of 1.2.1 would be used, but a local version
of 2.1.1 would cause davos to replace it with a valid pandas version, as if you had manually run pip install "pandas>=1.0,<2.0".
In both cases, the imported versions will fit the constraints specified in their onion comments,
and the next time numpy or pandas is smuggled with the same constraints, valid local installations will be found.
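To make the constraint check concrete, the sketch below tests a version string against comma-separated constraints like the ones in the onion comments above. It is a deliberately simplified stand-in, not how davos or pip evaluate specifiers: it handles only purely numeric versions and the six basic comparison operators, and the function name `satisfies` is hypothetical. Full PEP 440 semantics (pre-releases, `~=`, wildcards) are out of scope.

```python
import operator
import re

_OPS = {
    "==": operator.eq, "!=": operator.ne,
    ">=": operator.ge, "<=": operator.le,
    ">": operator.gt, "<": operator.lt,
}
_CLAUSE = re.compile(r"\s*(==|!=|>=|<=|>|<)\s*([0-9]+(?:\.[0-9]+)*)\s*$")


def _release(version, width):
    """Turn '1.2' into (1, 2, 0, ...) padded with zeros to the given width."""
    parts = [int(part) for part in version.split(".")]
    return tuple(parts + [0] * (width - len(parts)))


def satisfies(installed, constraints):
    """Return True if a numeric version meets every comma-separated constraint."""
    for clause in constraints.split(","):
        match = _CLAUSE.match(clause)
        if match is None:
            raise ValueError(f"unsupported constraint: {clause!r}")
        op, wanted = match.groups()
        # Pad both versions to the same length so 1.0 and 1.0.0 compare equal
        width = max(installed.count("."), wanted.count(".")) + 1
        if not _OPS[op](_release(installed, width), _release(wanted, width)):
            return False
    return True


print(satisfies("1.2.1", ">=1.0,<2.0"))   # prints True
print(satisfies("2.1.1", ">=1.0,<2.0"))   # prints False
print(satisfies("1.23.1", "==1.23.1"))    # prints True
```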
You can also force the state of a smuggled package to match a specific VCS ref (branch, revision, tag, release, etc.). For example:
smuggle hypertools as hyp # pip: git+https://github.com/ContextLab/hypertools.git@98a3d80
will load hypertools (aliased as "hyp"), as the package existed
on GitHub, at commit
98a3d80. The general format for VCS references in
onion comments follows that of the
pip-install command. See the
notes on smuggling from VCS below for additional info.
And with a few exceptions, smuggling a specific package version will work even if the package has already been imported!
Note: davos v0.2.x supports IPython environments (e.g.,
Jupyter and Colaboratory notebooks) only. v0.3.x will add
support for "regular" (i.e., non-interactive) Python scripts.
Use Cases
Simplify sharing reproducible code & Python environments
Different versions of the same package can often behave quite differently: bugs are introduced and fixed, features are implemented and removed, support for Python versions is added and dropped, etc. Because of this, Python code that is meant to be reproducible (e.g., tutorials, demos, data analyses) is commonly shared alongside a set of fixed versions for each package used. And since there is no Python-native way to specify package versions at runtime (see above), this typically takes the form of a pre-configured development environment the end user must build themselves (e.g., a Docker container or conda environment). Such environments can be cumbersome, slow to set up, resource-intensive, and confusing for newer users, and they require shipping both additional specification files and setup instructions along with your code. And even then, a well-intentioned user may alter the environment in a way that affects your carefully curated set of pinned packages (such as installing additional packages that trigger dependency updates).
Instead, davos allows you to share code with one simple instruction: just pip install davos! Replace your import
statements with smuggle statements, pin package versions in onion comments, and let davos take care of the rest.
Beyond its simplicity, this approach ensures your predetermined package versions are in place every time your code is
run.
Guarantee your code always uses the latest version, release, or revision
If you want to make sure you're always using the most recent release of a certain package, davos makes doing so easy:
smuggle mypkg # pip: mypkg --upgrade
Or if you have an automation designed to test your most recent commit on GitHub:
smuggle mypkg # pip: git+https://username/reponame.git
Compare behavior across package versions
The ability to smuggle a specific package version even after a different version has been imported makes davos a
useful tool for comparing behavior across multiple versions of the same package, within the same interpreter session:
def test_my_func_unchanged():
    """Regression test for `mypkg.my_func()`"""
    data = list(range(10))

    smuggle mypkg                    # pip: mypkg==0.1
    result1 = mypkg.my_func(data)

    smuggle mypkg                    # pip: mypkg==0.2
    result2 = mypkg.my_func(data)

    smuggle mypkg                    # pip: git+https://github.com/MyOrg/mypkg.git
    result3 = mypkg.my_func(data)

    assert result1 == result2 == result3
Usage
The smuggle Statement
Syntax
The smuggle statement is meant to be used in place of
the built-in import statement and shares
its full syntactic definition:
smuggle_stmt ::= "smuggle" module ["as" identifier] ("," module ["as" identifier])*
| "from" relative_module "smuggle" identifier ["as" identifier]
("," identifier ["as" identifier])*
| "from" relative_module "smuggle" "(" identifier ["as" identifier]
("," identifier ["as" identifier])* [","] ")"
| "from" module "smuggle" "*"
module ::= (identifier ".")* identifier
relative_module ::= "."* module | "."+
NB: uses the modified BNF grammar notation described in The Python Language Reference; see the same reference for the lexical definition of identifier.
In simpler terms, any valid syntax for import is also valid for smuggle.
Rules
- Like import statements, smuggle statements are whitespace-insensitive, unless a lack of whitespace between two tokens would cause them to be interpreted as a different token:

      from os.path smuggle dirname, join as opj                       # valid
      from   os   .  path   smuggle  dirname    ,join      as   opj   # also valid
      from os.path smuggle dirname, join asopj                        # invalid ("asopj" != "as opj")

- Any context that would cause an import statement not to be executed will have the same effect on a smuggle statement:

      # smuggle matplotlib.pyplot as plt           # not executed
      print('smuggle matplotlib.pyplot as plt')    # not executed
      foo = """
      smuggle matplotlib.pyplot as plt"""          # not executed

- Because the davos parser is less complex than the full Python parser, there are two fairly non-disruptive edge cases where an import statement would be syntactically valid but a smuggle statement would not:
  - The exec function:

        exec('from pathlib import Path')         # executed
        exec('from pathlib smuggle Path')        # raises SyntaxError

  - A one-line compound statement clause:

        if True: import random                   # executed
        if True: smuggle random                  # raises SyntaxError

        while True: import math; break           # executed
        while True: smuggle math; break          # raises SyntaxError

        for _ in range(1): import json           # executed
        for _ in range(1): smuggle json          # raises SyntaxError

        # etc...

- In IPython environments (e.g., Jupyter & Colaboratory notebooks) smuggle statements always load names into the global namespace:

      # example.ipynb
      import davos

      def import_example():
          import datetime

      def smuggle_example():
          smuggle datetime

      import_example()
      type(datetime)        # raises NameError

      smuggle_example()
      type(datetime)        # returns <class 'module'>
The Onion Comment
An onion comment is a special type of inline comment placed on a line containing a smuggle statement. Onion comments
can be used to control how davos:
- determines whether the smuggled package should be installed
- installs the smuggled package, if necessary
Onion comments are also useful when smuggling a package whose distribution name (i.e., the name used when installing it) is different from its top-level module name (i.e., the name used when importing it). Take for example:
from sklearn.decomposition smuggle PCA # pip: scikit-learn
The onion comment here (# pip: scikit-learn) tells davos that if "sklearn" does not exist
locally, the "scikit-learn" package should be installed.
Syntax
Onion comments follow a simple but specific syntax, inspired in part by the type comment syntax introduced in PEP 484. The following is a loose (pseudo-)syntactic definition for an onion comment:
onion_comment ::= "#" installer ":" install_opt* pkg_spec install_opt*
installer ::= ("pip" | "conda")
pkg_spec ::= identifier [version_spec]
NB: uses the modified BNF grammar notation described in The Python Language Reference; see the same reference for the lexical definition of identifier.
where installer is the program used to install the package; install_opt is any option accepted by the installer's
"install" command; and version_spec may be a
version specifier defined by
PEP 440 followed by a
version string, or an alternative syntax valid
for the given installer program. For example, pip uses specific syntaxes for
local,
editable, and
VCS-based installation.
Less formally, an onion comment simply consists of two parts, separated by a colon:
- the name of the installer program (e.g., pip)
- arguments passed to the program's "install" command
Thus, you can essentially think of writing an onion comment as taking the full shell command you would run to install the package, and replacing "install" with ":". For instance, the command:
pip install -I --no-cache-dir numpy==1.23.1 -vvv --timeout 30
is easily translated into an onion comment as:
smuggle numpy # pip: -I --no-cache-dir numpy==1.23.1 -vvv --timeout 30
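The "replace install with a colon" rule of thumb can also be run in reverse. The sketch below turns the text of an onion comment back into an argument list suitable for a subprocess call. It is only an illustration of the correspondence: `onion_to_command` is a hypothetical helper, it supports pip only, and invoking pip as `sys.executable -m pip` is one reasonable choice rather than a description of what davos does internally.

```python
import shlex
import sys


def onion_to_command(onion_comment):
    """Translate '# pip: <args>' into the equivalent 'pip install <args>' argv list."""
    # Drop the leading '#' and spaces, then split once at the first colon
    installer, _, arg_string = onion_comment.lstrip("# ").partition(":")
    installer = installer.strip()
    if installer != "pip":
        raise ValueError(f"unsupported installer: {installer!r}")
    # shlex.split() tokenizes the arguments the way a POSIX shell would
    return [sys.executable, "-m", "pip", "install", *shlex.split(arg_string)]


command = onion_to_command("# pip: -I --no-cache-dir numpy==1.23.1 -vvv --timeout 30")
print(command[3:])
# prints ['install', '-I', '--no-cache-dir', 'numpy==1.23.1', '-vvv', '--timeout', '30']
```

Because only the first colon is used as the separator, VCS URLs such as `git+https://...` pass through intact.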
In practice, onion comments are identified as matches for the regular expression:
#+ *(?:pip|conda) *: *[^#\n ].+?(?= +#| *\n| *$)
Note: support for installing smuggled packages via the conda package manager
will be added in v0.2. For v0.1, onion comments should always specify "pip" as the installer program.
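The regular expression quoted above can be tried directly with Python's `re` module. The pattern is copied verbatim from this section; the wrapper function `find_onion` is a hypothetical convenience for the demonstration, and the real parser applies additional rules (described below) that this pattern alone does not enforce.

```python
import re

# Pattern copied from the text above
ONION_RE = re.compile(r"#+ *(?:pip|conda) *: *[^#\n ].+?(?= +#| *\n| *$)")


def find_onion(line):
    """Return the onion comment found on a line of code, or None."""
    match = ONION_RE.search(line)
    return match.group() if match else None


print(find_onion("smuggle numpy as np    # pip: numpy==1.23.1"))
# prints # pip: numpy==1.23.1

# The lookahead stops the match before a trailing, unrelated comment
print(find_onion("smuggle tqdm    # pip: tqdm>=4.46,<4.60 # this comment is ignored"))
# prints # pip: tqdm>=4.46,<4.60

# Ordinary comments are not onion comments
print(find_onion("smuggle os  # plain comment"))  # prints None
```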
Rules
- An onion comment must be placed on the same line as a smuggle statement; otherwise, it is not parsed:

      # assuming the dateutil package is not installed...

      # pip: python-dateutil                # <-- has no effect
      smuggle dateutil                      # raises InstallerError (no "dateutil" package exists)

      smuggle dateutil                      # raises InstallerError (no "dateutil" package exists)
      # pip: python-dateutil                # <-- has no effect

      smuggle dateutil                      # pip: python-dateutil   # installs "python-dateutil" package, if necessary

- An onion comment may be followed by unrelated inline comments as long as they are separated by at least one space:

      smuggle tqdm    # pip: tqdm>=4.46,<4.60 # this comment is ignored
      smuggle tqdm    # pip: tqdm>=4.46,<4.60            # so is this one
      smuggle tqdm    # pip: tqdm>=4.46,<4.60# but this comment raises OnionArgumentError

- An onion comment must be the first inline comment immediately following a smuggle statement; otherwise, it is not parsed:

      smuggle numpy    # pip: numpy!=1.19.1      # <-- guarantees smuggled version is *not* v1.19.1
      smuggle numpy    # has no effect -->       # pip: numpy==1.19.1

  This also allows you to easily "comment out" onion comments:

      smuggle numpy    ## pip: numpy!=1.19.1     # <-- has no effect

- Onion comments are generally whitespace-insensitive, but installer arguments must be separated by at least one space:

      from umap smuggle UMAP    # pip: umap-learn --user -v --no-clean     # valid
      from umap smuggle UMAP#pip:umap-learn --user     -v    --no-clean    # also valid
      from umap smuggle UMAP    # pip: umap-learn --user-v--no-clean       # raises OnionArgumentError

- Onion comments have no effect on standard library modules:

      smuggle threading    # pip: threading==9999    # <-- has no effect

- When smuggling multiple packages with a single smuggle statement, an onion comment may be used to refer to the first package listed:

      smuggle nilearn, nibabel, nltools    # pip: nilearn==0.7.1

- If multiple separate smuggle statements are placed on a single line, an onion comment may be used to refer to the last statement:

      smuggle gensim; smuggle spacy; smuggle nltk    # pip: nltk~=3.5 --pre

- For multiline smuggle statements, an onion comment may be placed on the first line:

      from scipy.interpolate smuggle (    # pip: scipy==1.6.3
          interp1d,
          interpn as interp_ndgrid,
          LinearNDInterpolator,
          NearestNDInterpolator,
      )

  ... or on the last line:

      from scipy.interpolate smuggle (interp1d,                  # this comment has no effect
                                      interpn as interp_ndgrid,
                                      LinearNDInterpolator,
                                      NearestNDInterpolator)     # pip: scipy==1.6.3

  ... though the first line takes priority:

      from scipy.interpolate smuggle (    # pip: scipy==1.6.3    # <-- this version is installed
          interp1d,
          interpn as interp_ndgrid,
          LinearNDInterpolator,
          NearestNDInterpolator,
      )    # pip: scipy==1.6.2                                   # <-- this comment is ignored

  ... and all comments not on the first or last line are ignored:

      from scipy.interpolate smuggle (
          interp1d,                       # pip: scipy==1.6.3    # <-- ignored
          interpn as interp_ndgrid,
          LinearNDInterpolator,           # unrelated comment    # <-- ignored
          NearestNDInterpolator
      )                                   # pip: scipy==1.6.2    # <-- parsed

- The onion comment is intended to describe how a specific smuggled package should be installed if it is not found locally, in order to make it available for immediate use. Therefore, installer options that either (A) install packages other than the smuggled package and its dependencies (e.g., from a specification file), or (B) cause the smuggled package not to be installed, are disallowed. The options listed below will raise an OnionArgumentError:
  - -h, --help
  - -r, --requirement
  - -V, --version
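The last rule can be pictured as a simple screening pass over the installer arguments. The following is a rough sketch under stated assumptions, not the actual davos check: `check_onion_args` is a hypothetical name, a plain `ValueError` stands in for `OnionArgumentError`, and only whole tokens (optionally in `--option=value` form) are inspected, so attached short forms such as `-rreqs.txt` would slip through.

```python
import shlex

# Options named in the rule above
DISALLOWED = {"-h", "--help", "-r", "--requirement", "-V", "--version"}


def check_onion_args(arg_string):
    """Return the tokenized installer arguments, rejecting disallowed options."""
    tokens = shlex.split(arg_string)
    for token in tokens:
        # Treat '--requirement=reqs.txt' the same as '--requirement reqs.txt'
        option = token.split("=", 1)[0]
        if option in DISALLOWED:
            # ValueError is a stand-in for davos's OnionArgumentError
            raise ValueError(f"option not allowed in an onion comment: {option}")
    return tokens


print(check_onion_args("numpy==1.23.1 -vvv"))  # prints ['numpy==1.23.1', '-vvv']

try:
    check_onion_args("-r requirements.txt")
except ValueError as error:
    print(error)  # prints option not allowed in an onion comment: -r
```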
The davos Config
The davos config object stores options and data that affect how davos behaves. After importing davos, the config
instance (a singleton) for the current session is available as davos.config, and its various fields are accessible as
attributes. The config object exposes a mixture of writable and read-only fields. Most davos.config attributes can be
assigned values to control aspects of davos behavior, while others are available for inspection but are set and used
internally. Additionally, certain config fields may be writable in some situations but not others (e.g. only if the
importing environment supports a particular feature). Once set, davos config options last for the lifetime of the
interpreter (unless updated); however, they do not persist across interpreter sessions. A full list of davos config
fields is available below:
Reference
| Field | Description | Type | Default | Writable? |
|---|---|---|---|---|
| active | Whether or not the davos parser should be run on subsequent input (cells, in Jupyter/Colab notebooks). Setting to True activates the davos parser, enables the smuggle keyword, and injects the smuggle() function into the user namespace. Setting to False deactivates the davos parser, disables the smuggle keyword, and removes "smuggle" from the user namespace (if it holds a reference to the smuggle() function). See How it Works for more info. | bool | True | ✅ |
| auto_rerun | If True, when smuggling a previously-imported package that cannot be reloaded (see Smuggling packages with C-extensions), davos will automatically restart the interpreter and rerun all code up to (and including) the current smuggle statement. Otherwise, issues a warning and prompts the user with buttons to either restart/rerun or continue running. | bool | False | ✅ (Jupyter notebooks only) |
| confirm_install | Whether or not davos should require user confirmation ([y/n] input) before installing a smuggled package | bool | False | ✅ |
| environment | A label describing the environment into which davos was imported. Checked internally to determine which interchangeable implementation functions are used, whether certain config fields are writable, and various other behaviors | Literal['Python', 'IPython<7.0', 'IPython>=7.0', 'Colaboratory'] | N/A | ❌ |
| ipython_shell | The global IPython interactive shell instance | IPython.core.interactiveshell.InteractiveShell | N/A | ❌ |
| noninteractive | Set to True to run davos in non-interactive mode (all user input and confirmation will be disabled). NB: (1) setting to True disables confirm_install if previously enabled; (2) if auto_rerun is False in non-interactive mode, davos will throw an error if a smuggled package cannot be reloaded | bool | False | ✅ (Jupyter notebooks only) |
| pip_executable | The path to the pip executable used to install smuggled packages. Must be a path (str or pathlib.Path) to a real file. Default is programmatically determined from Python environment; falls back to sys.executable -m pip if executable can't be found | str | pip exe path or sys.executable -m pip | ✅ |
| smuggled | A cache of packages smuggled during the current interpreter session. Formatted as a dict whose keys are package names and values are the (.split() and ';'.join()ed) onion comments. Implemented this way so that any non-whitespace change to installer arguments triggers re-installation | dict[str, str] | {} | ❌ |
| suppress_stdout | If True, suppress all unnecessary output issued by both davos and the installer program. Useful when smuggling packages that need to install many dependencies and therefore generate extensive output. If the installer program throws an error while output is suppressed, both stdout & stderr will be shown with the traceback | bool | False | ✅ |
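The normalization described for the smuggled field (split on whitespace, then join with ';') is easy to illustrate. The sketch below assumes the normalization is applied to the installer-argument text; the helper names `cache_value` and `needs_reinstall`, and the plain dict used as the cache, are illustrative only.

```python
def cache_value(onion_args):
    """Collapse runs of whitespace so only meaningful changes alter the value."""
    return ";".join(onion_args.split())


def needs_reinstall(cache, package, onion_args):
    """Return True if the package was not smuggled before with these arguments."""
    return cache.get(package) != cache_value(onion_args)


smuggled = {"numpy": cache_value("numpy==1.23.1 -vv")}
print(smuggled)  # prints {'numpy': 'numpy==1.23.1;-vv'}

# Extra whitespace does not count as a change
print(needs_reinstall(smuggled, "numpy", "numpy==1.23.1     -vv"))  # prints False

# A different version pin does
print(needs_reinstall(smuggled, "numpy", "numpy==1.24.0 -vv"))      # prints True
```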
Top-level Functions
davos also provides a few convenience functions for reading/setting config values:
- davos.activate()

  Activate the davos parser, enable the smuggle keyword, and inject the smuggle() function into the namespace. Equivalent to setting davos.config.active = True. See How it Works for more info.

- davos.deactivate()

  Deactivate the davos parser, disable the smuggle keyword, and remove the name smuggle from the namespace if (and only if) it refers to the smuggle() function. If smuggle has been overwritten with a different value, the variable will not be deleted. Equivalent to setting davos.config.active = False. See How it Works for more info.

- davos.is_active()

  Return the current value of davos.config.active.

- davos.configure(**kwargs)

  Set multiple davos.config fields at once by passing values as keyword arguments, e.g.:

      import davos

      davos.configure(active=False, noninteractive=True, pip_executable='/usr/bin/pip3')

  is equivalent to:

      import davos

      davos.active = False
      davos.noninteractive = True
      davos.pip_executable = '/usr/bin/pip3'
How It Works: The davos Parser
Functionally, importing davos appears to enable a new Python keyword, "smuggle". However, davos doesn't actually
modify the rules or reserved keywords used by
Python's parser and lexical analyzer in order to do so—in fact, modifying the Python grammar is not possible at
runtime and would require rebuilding the interpreter. Instead, in IPython
environments like Jupyter and
Colaboratory notebooks, davos implements the smuggle
keyword via a combination of namespace injections and its own (far simpler) custom parser.
The smuggle keyword can be enabled and disabled at will by "activating" and "deactivating" davos (see the
davos Config Reference and Top-level Functions, above). When davos is
imported, it is automatically activated by default. Activating davos triggers two things:
- The smuggle() function is injected into the IPython user namespace
- The davos parser is registered as a custom input transformer
IPython preprocesses all executed code as plain text before it is sent to the Python parser in order to handle
special constructs like %magic and
!shell commands. davos
hooks into this process to transform smuggle statements into syntactically valid Python code. The davos
parser uses a regular expression to match each
line of code containing a smuggle statement (and, optionally, an onion comment), extracts information from its text,
and replaces it with an analogous call to the smuggle() function. Thus, even though the code visible to the user may
contain smuggle statements, e.g.:
smuggle numpy as np # pip: numpy>1.16,<=1.24 -vv
the code that is actually executed by the Python interpreter will not:
smuggle(name="numpy", as_="np", installer="pip", args_str="""numpy>1.16,<=1.24 -vv""", installer_kwargs={'editable': False, 'spec': 'numpy>1.16,<=1.24', 'verbosity': 2})
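The text-to-function-call rewrite shown above can be approximated in a few lines. This is a heavily simplified sketch, not the davos parser: the pattern `SMUGGLE_LINE` and the function `transform_line` are hypothetical, only the single-line form `smuggle <module> [as <alias>]` with an optional onion comment is handled, and the `installer_kwargs` argument seen in the real output is omitted.

```python
import re

# Simplified pattern: one module, optional alias, optional onion comment
SMUGGLE_LINE = re.compile(
    r"^\s*smuggle\s+(?P<name>[\w.]+)"
    r"(?:\s+as\s+(?P<alias>\w+))?"
    r"\s*(?:#+\s*(?P<installer>pip|conda)\s*:\s*(?P<args>[^#\n]+?))?"
    r"\s*(?:#.*)?$"
)


def transform_line(line):
    """Rewrite a simple smuggle statement as a smuggle() call; leave other lines alone."""
    match = SMUGGLE_LINE.match(line)
    if match is None:
        return line
    name, alias = match["name"], match["alias"]
    installer, args = match["installer"], match["args"]
    parts = [f'name="{name}"']
    if alias:
        parts.append(f'as_="{alias}"')
    if installer:
        parts.append(f'installer="{installer}"')
        parts.append(f'args_str="""{args.strip()}"""')
    return "smuggle(" + ", ".join(parts) + ")"


print(transform_line("smuggle numpy as np    # pip: numpy>1.16,<=1.24 -vv"))
# prints smuggle(name="numpy", as_="np", installer="pip", args_str="""numpy>1.16,<=1.24 -vv""")

print(transform_line("import os"))  # prints import os
```

Because the rewrite happens on text before Python parses it, a commented-out line such as `# smuggle numpy` does not start with the keyword and is left untouched, consistent with the rules listed earlier.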
The davos parser can be deactivated at any time, and doing so triggers the opposite actions of activating it:
- The name "smuggle" is deleted from the IPython user namespace, unless it has been overwritten and no longer refers to the smuggle() function
- The davos parser input transformer is deregistered.
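The "delete the name only if it still refers to the injected function" behavior amounts to an identity check before removal. A minimal sketch, using a plain dict as a stand-in for the IPython user namespace and a hypothetical `smuggle_stub` in place of the real smuggle() function:

```python
def smuggle_stub(*args, **kwargs):
    """Hypothetical stand-in for the injected smuggle() function."""


def remove_if_unchanged(namespace, name, injected):
    """Delete `name` from the namespace only if it still refers to `injected`."""
    if namespace.get(name) is injected:
        del namespace[name]
        return True
    return False


# The name still points at the injected function, so it is removed
user_ns = {"smuggle": smuggle_stub}
print(remove_if_unchanged(user_ns, "smuggle", smuggle_stub))  # prints True
print("smuggle" in user_ns)                                   # prints False

# The user has rebound the name, so their value is left alone
user_ns = {"smuggle": "my own variable"}
print(remove_if_unchanged(user_ns, "smuggle", smuggle_stub))  # prints False
print(user_ns["smuggle"])                                     # prints my own variable
```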
Note: in Jupyter and Colaboratory notebooks, IPython parses and transforms all text in a cell before sending it
to the kernel for execution. This means that importing or activating davos will not make the smuggle statement
available until the next cell, because all lines in the current cell were transformed before the davos parser was
registered. However, deactivating davos disables the smuggle statement immediately—although the davos
parser will have already replaced all smuggle statements with smuggle() function calls, removing the function from
the namespace causes them to throw NameError.
Additional Notes
- Reimplementing installer programs' CLI parsers

  The davos parser extracts info from onion comments by passing them to a (slightly modified) reimplementation of their specified installer program's CLI parser. This is somewhat redundant, since the arguments will eventually be re-parsed by the actual installer program if the package needs to be installed. However, it affords a number of advantages, such as:
  - detecting errors early during the parser phase, before spending any time running code above the line containing the smuggle statement
  - preventing shell injections in onion comments (e.g., #pip: --upgrade numpy && rm -rf / fails due to the OnionParser, but would otherwise execute successfully)
  - allowing certain installer arguments to temporarily influence davos behavior while smuggling the current package (see Installer options that affect davos behavior below for specific info)
- Installer options that affect davos behavior

  Passing certain options to the installer program via an onion comment will also affect the corresponding smuggle statement in a predictable way:
  - --force-reinstall | -I, --ignore-installed | -U, --upgrade

    The package will be installed, even if it exists locally
  - --no-input

    Disables input prompts, analogous to temporarily setting davos.config.noninteractive to True. Overrides value of davos.config.confirm_install.
  - --src <dir> | -t, --target <dir>

    Prepends <dir> to sys.path if not already present so the package can be imported.
- Smuggling packages with C-extensions

  Some Python packages that rely heavily on custom data types implemented via C-extensions (e.g., numpy, pandas) dynamically generate modules defining various C functions and data structures, and link them to the Python interpreter when they are first imported. Depending on how these objects are initialized, they may not be subject to normal garbage collection, and persist despite their reference count dropping to zero. This can lead to unexpected errors when reloading the Python module that creates them, particularly if their dynamically generated source code has been changed (e.g., because the reloaded package is a newer version).

  This can occasionally affect davos's ability to smuggle a new version of a package (or dependency) that was previously imported. To handle this, davos first checks each package it installs against sys.modules. If a different version has already been loaded by the interpreter, davos will attempt to replace it with the requested version. If this fails, davos will restore the old package version in memory, while replacing it with the new package version on disk. This allows subsequent code that uses the non-reloadable module to still execute in most cases, while dependency checks for other packages run against the updated version. Then, depending on the value of davos.config.auto_rerun, davos will either automatically restart the interpreter to load the updated package, prompt you to do so, or raise an exception.

- from ... import ... statements and reloading modules

  The Python docs for importlib.reload() include the following caveat:

  "If a module imports objects from another module using from … import …, calling reload() for the other module does not redefine the objects imported from it — one way around this is to re-execute the from statement, another is to use import and qualified names (module.name) instead."

  The same applies to smuggling packages or modules from which objects have already been loaded. If object name from module module was loaded using either from module import name or from module smuggle name, subsequently running smuggle module # pip: module --upgrade will in fact install and load an upgraded version of module, but the name object will still be that of the old version! To fix this, you can simply run from module smuggle name either in lieu of or after smuggle module.

- Smuggling packages from version control systems

  The first time during an interpreter session that a given package is installed from a VCS URL, it is assumed not to be present locally, and is therefore freshly installed. pip clones non-editable VCS repositories into a temporary directory, runs setup.py install, and then immediately deletes them. Since no information is retained about the state of the repository at installation, it is impossible to determine whether an existing package satisfies the state (i.e., branch, tag, commit hash, etc.) requested for the smuggled package.
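The reload caveat in the notes above is easy to reproduce with plain imports, no davos required. The sketch below writes a throwaway module (the name `reload_demo_mod` is arbitrary) to a temporary directory, imports a name from it, changes the file, and reloads the module. Bytecode caching is switched off for the demo so the rewritten source is always re-read.

```python
import importlib
import sys
import tempfile
from pathlib import Path

sys.dont_write_bytecode = True  # avoid stale .pyc files in this demo

with tempfile.TemporaryDirectory() as tmp:
    mod_path = Path(tmp) / "reload_demo_mod.py"
    mod_path.write_text("VALUE = 'old'\n")
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        import reload_demo_mod
        from reload_demo_mod import VALUE

        # Simulate a new package version replacing the old one on disk
        mod_path.write_text("VALUE = 'new version'\n")
        importlib.reload(reload_demo_mod)

        stale = VALUE                      # name bound before the reload
        fresh = reload_demo_mod.VALUE      # qualified access sees the update

        # Re-executing the from statement rebinds the name
        from reload_demo_mod import VALUE
        rebound = VALUE
    finally:
        sys.path.remove(tmp)
        sys.modules.pop("reload_demo_mod", None)

print(stale)    # prints old
print(fresh)    # prints new version
print(rebound)  # prints new version
```

The same pattern explains why re-running the from ... smuggle ... statement is needed after smuggling an upgraded module.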