"log" is not exported from module "dlt.pipeline.progress"
dlt version
1.12.3
Describe the problem
Creating a simple file:
from dlt.pipeline.progress import log
and running pyright against this:
❯ poetry run pyright test.py
/redacted/test.py
/redacted/test.py:1:35 - error: "log" is not exported from module "dlt.pipeline.progress"
Import from "dlt.common.runtime.collector" instead (reportPrivateImportUsage)
1 error, 0 warnings, 0 informations
Expected behavior
No pyright error should result.
Steps to reproduce
See above.
Operating system
macOS
Runtime environment
Local
Python version
3.12
dlt data source
Not relevant.
dlt destination
No response
Other deployment details
No response
Additional information
No response
The same error arises for
from dlt.progress import log
@djudjuu pls take a look at this
@rudolfix This is caused by name shadowing. The type checker will complain despite imports working.
# dlt/__init__.py
# `dlt.pipeline` refers to `dlt/pipeline/__init__.py`, which contains `pipeline`
from dlt.pipeline import pipeline as _pipeline
# the name `dlt.pipeline` is set to `dlt.pipeline.pipeline`
pipeline = _pipeline
# export `dlt.pipeline` (which is `dlt.pipeline.pipeline`),
# making `dlt.pipeline` (`dlt/pipeline/__init__.py`) inaccessible
__all__ = (
...,
"pipeline"
)
The name progress comes from dlt.pipeline (the module at dlt/pipeline/__init__.py), which becomes inaccessible. This causes oddities like:
from dlt import progress # works
# start a new clean Python session
import dlt.progress # fails; this exists at dlt.pipeline.progress
Note
To debug this, you must restart Python each time to avoid import side-effects. Once you import anything from dlt, Python will look at sys.modules["dlt"] first.
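The shadowing described above can be reproduced without dlt. The sketch below is only an illustration under assumptions: `shadowpkg`, its file contents, and the helper names are hypothetical and do not mirror dlt's actual source. It writes a throwaway package into a temporary directory, imports it, and reports what the runtime sees.

```python
import importlib
import sys
import tempfile
import types
from pathlib import Path

# Hypothetical package name; the layout only imitates the pattern described above.
PKG = "shadowpkg"


def build_package(root: Path) -> None:
    """Write a throwaway package whose __init__ shadows its own subpackage."""
    pipeline_dir = root / PKG / "pipeline"
    pipeline_dir.mkdir(parents=True)
    # shadowpkg/pipeline/__init__.py defines a function named like the subpackage.
    (pipeline_dir / "__init__.py").write_text(
        "def pipeline():\n    return 'pipeline called'\n"
    )
    # shadowpkg/pipeline/progress.py stands in for dlt/pipeline/progress.py.
    (pipeline_dir / "progress.py").write_text("log = 'log collector'\n")
    # shadowpkg/__init__.py rebinds the name `pipeline` to the function.
    (root / PKG / "__init__.py").write_text(
        f"from {PKG}.pipeline import pipeline as _pipeline\n"
        f"from {PKG}.pipeline import progress\n"
        "pipeline = _pipeline\n"
        "__all__ = ('pipeline', 'progress')\n"
    )


def inspect_shadowing() -> dict:
    """Import the throwaway package and report how each name resolves at runtime."""
    with tempfile.TemporaryDirectory() as tmp:
        build_package(Path(tmp))
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            pkg = importlib.import_module(PKG)
            # Equivalent of `from shadowpkg.pipeline.progress import log`.
            submodule = importlib.import_module(f"{PKG}.pipeline.progress")
            try:
                # Equivalent of `import shadowpkg.progress`.
                importlib.import_module(f"{PKG}.progress")
                top_level_import_fails = False
            except ModuleNotFoundError:
                top_level_import_fails = True
            return {
                "attribute_is_module": isinstance(pkg.pipeline, types.ModuleType),
                "sys_modules_entry_is_module": isinstance(
                    sys.modules[f"{PKG}.pipeline"], types.ModuleType
                ),
                "deep_import_value": submodule.log,
                "from_pkg_import_progress_works": isinstance(
                    pkg.progress, types.ModuleType
                ),
                "import_pkg_progress_fails": top_level_import_fails,
            }
        finally:
            # Undo the import side effects so repeated runs start clean.
            sys.path.remove(tmp)
            for name in [n for n in sys.modules if n == PKG or n.startswith(PKG + ".")]:
                del sys.modules[name]


if __name__ == "__main__":
    for key, value in inspect_shadowing().items():
        print(f"{key}: {value}")
```

This only covers the runtime side: the attribute `shadowpkg.pipeline` is the function, while the import system still finds the subpackage through `sys.modules`. It does not reproduce pyright's analysis.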
I've looked into this, and for the moment I am not seeing a quick way out of it. I'll investigate more, but I don't have the capacity to give this high priority at the moment.
I am wondering why all this import complexity is even necessary?
The issue you opened is valid. Unfortunately, the challenge is backwards compatibility. The existing code has some complexity, and "doing things in a cleaner way" is likely to break users' code.
To respect semantic versioning, we'll need to wait for a major release of dlt.