marimo
marimo copied to clipboard
On module change "autorun" with nested imports
Describe the bug
Not sure it is intended behavior or not, but it surprised me that some modules depending on a changed file were not automatically reloaded (using the autorun setting for on module change).
my_package:
### core.py ###
from . import constants
def f():
return constants.A
### constants.py ###
A = 1
my_notebook (in another folder):
from my_package import core
core.f()
The autoreload works well if I directly change the core.py file. But if I change the constants.py file to A = 2 instead, the notebook is not updated. Is there a way to force the reload of every modules depending on a separate module that changed?
Will you submit a PR?
- [ ] Yes
Environment
{
"marimo": "0.17.0",
"editable": false,
"location": "/home/circleci/project/.venv/lib/python3.12/site-packages/marimo",
"OS": "Linux",
"OS Version": "6.14.0-111032-tuxedo",
"Processor": "x86_64",
"Python Version": "3.12.8",
"Locale": "en_US",
"Binaries": {
"Browser": "--",
"Node": "v18.18.2"
},
"Dependencies": {
"click": "8.2.1",
"docutils": "0.22.2",
"itsdangerous": "2.2.0",
"jedi": "0.19.2",
"markdown": "3.8.2",
"narwhals": "2.1.2",
"packaging": "25.0",
"psutil": "7.0.0",
"pygments": "2.19.2",
"pymdown-extensions": "10.16.1",
"pyyaml": "6.0.2",
"starlette": "0.48.0",
"tomlkit": "0.13.3",
"typing-extensions": "4.14.1",
"uvicorn": "0.38.0",
"websockets": "15.0.1"
},
"Optional Dependencies": {
"altair": "5.5.0",
"anywidget": "0.9.18",
"nbformat": "5.10.4",
"pandas": "2.3.1",
"pyarrow": "21.0.0",
"loro": "1.8.1",
"pytest": "8.4.1",
"ruff": "0.12.9"
},
"Experimental Flags": {}
}
Code to reproduce
No response
Looking at marimo's internals to find the root cause, I think it is because of silently handled exceptions raised by modulefinder.ModuleFinder(excludes=excludes).run_script(module.__file__). I ran into two kinds of exceptions:
ImportError: relative importpath too deep: maybe linked to relative imports?AttributeError: 'NoneType' object has no attribute 'is_package': regardingspec.loader(looks similar to this issue https://github.com/python/cpython/issues/84530). Likely related to thenumpycomment in the code below, but not the primary cause of my issue.
Relevant code: https://github.com/marimo-team/marimo/blob/2a8bff4bf36ac580d6a996265d14d31a380b1f3b/marimo/_runtime/reload/autoreload.py#L111-L127
Using:
fqname = module.__spec__.name if module.__spec__ else module.__name__
finder.import_hook(fqname)
instead of
finder.run_script(module.__file__)
seems to solve the relative import issue. WDYT?
@mthiboust changing to module.__spec__.name sounds good to me since it fixes your issue. we would definitely appreciate the contribution if you are up for it.
@mthiboust Thanks for digging in. Are there any tradeoffs involved in switching to import_hook instead of run_script? import_hook doesn't seem to be documented in the public API.
Looking at the source, run_script seems to do a lot more analysis.
Instead of replacing run_script, can we fallback to import_hook in certain cases?
Instead of replacing
run_script, can we fallback toimport_hookin certain cases?
Sounds reasonable.
BTW, I ran into another issue: only the updated files are reloaded, not the modules depending on them. It means that the automatic reload won't take file updates into account if they don't directly occur in the imported modules. This one looks trickier to fix because modulefinder.ModuleFinder only returns a list of dependencies instead of a graph of dependencies.
It would be great to have a modified version of modulefinder.ModuleFinder:
- returning a graph of dependencies
- supporting an
includesparam (in addition to the existingexcludesparam) in order to focus on a subset of dependencies (e.g. only the current project in most use cases)
@mthiboust ruff analyze could be good useful tool (although it's API says it is not stable and an optional dep). but if you'd want to create a proof-of-concept with it, we could make it the main ModuleFinder and then have a fallback ModuleFinder to what we have today
I didn't know about ruff analyze graph --direction dependents. It is very efficient and supports both absolute/relative imports. However, it seems to only works with files from the current project, not with third party dependencies in the site-packages folder. Even if it is a limitation compared to current implementation, I think that it would cover most use cases where we don't need the automatic reload for file updates in third party dependencies.
A quick summary of current status.
After experimenting a bit more, I am not able to reproduce a simple MWE having the ImportError: relative importpath too deep error. I had this error on a complex private repo using relative imports with many levels of subpackages. For this specific case, using finder.import_hook instead of finder.run_script avoided an exception and the code was able to get the list of dependencies for my imported module. But anyway, it was not sufficient to be useful because nested imports are not reloaded anyway. This is a more fundamental issue.
For this, we would need to retrieve a dependency graph. As you suggested, there are two main paths:
- Via a reimplementation of python
modulefinderstd lib to return a dependency graph instead of a flat list of dependencies. If following this path, theImportError: relative importpath too deeperror should be investigated as well. - Via
ruff analyze graphthat already returns a dependency graph (but not supporting installed packages insite-packagesfolder). This path already supports the relative imports.
I don't think it is worth spending time on reimplementing a graph-aware version of python modulefinder because of many low-level complexities (relative imports being one, but also the missing is_package attribute of spec.loader for some packages, and probably other unknown surprises on the way).
I opened a proof-of-concept PR implementing the ruff path. It is working well on a few examples I have, but the PR would need more work.