[Bug] dbt hangs when one of its custom exceptions is raised inside a multiprocessing context
Is this a new bug in dbt-core?
- [X] I believe this is a new bug in dbt-core
- [X] I have searched the existing issues, and I could not find an existing issue for this bug
Current Behavior
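While debugging sqlfluff/sqlfluff#6037, I found that dbt hangs when a dbt exception is raised inside a multiprocessing worker. The exception cannot be unpickled in the parent process (see the traceback under Relevant log output), so the pool's result-handler thread dies and the call never returns.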
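The failure mode can be sketched without dbt at all. The snippet below is a minimal illustration, assuming a hypothetical stand-in class `TargetMissingError` (not a dbt class) whose `__init__` requires several positional arguments but passes only a formatted message to `Exception.__init__`. Pickling succeeds, but unpickling calls the constructor with just the stored message and fails with a `TypeError`, as in the log below.

```python
import pickle


class TargetMissingError(Exception):
    """Hypothetical stand-in for a dbt exception; not part of dbt."""

    def __init__(self, node, target_name, target_kind):
        self.node = node
        self.target_name = target_name
        self.target_kind = target_kind
        # Only the formatted message ends up in self.args.
        super().__init__(
            f"{node} depends on a {target_kind} named {target_name!r} "
            "which was not found"
        )


err = TargetMissingError("my_first_dbt_model", "abc", "node")

# Serialising works: the default reduction stores (class, self.args, __dict__).
payload = pickle.dumps(err)

# Deserialising calls TargetMissingError(*self.args), i.e. with the message
# only, so the remaining required arguments are missing.
unpickle_error = None
try:
    pickle.loads(payload)
except TypeError as exc:
    unpickle_error = exc

print(type(unpickle_error).__name__)
```

In a `multiprocessing.Pool`, the `pickle.loads` step runs in the parent's result-handler thread, which is why the error surfaces there rather than in the worker.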
Expected Behavior
The exceptions should implement `__reduce__` so that they can be pickled and unpickled, which prevents the hang.
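As a rough sketch of what this could look like, the example below uses a hypothetical stand-in class `TargetMissingError` (not dbt's actual implementation) and gives it a `__reduce__` that returns the class together with the original constructor arguments, so unpickling rebuilds the exception the same way it was first created.

```python
import pickle


class TargetMissingError(Exception):
    """Hypothetical stand-in for a dbt exception; not part of dbt."""

    def __init__(self, node, target_name, target_kind):
        self.node = node
        self.target_name = target_name
        self.target_kind = target_kind
        super().__init__(
            f"{node} depends on a {target_kind} named {target_name!r} "
            "which was not found"
        )

    def __reduce__(self):
        # Tell pickle to rebuild the exception from its constructor arguments
        # instead of from self.args (which only holds the message).
        return (type(self), (self.node, self.target_name, self.target_kind))


original = TargetMissingError("my_first_dbt_model", "abc", "node")
restored = pickle.loads(pickle.dumps(original))

print(type(restored).__name__, restored.target_name)
```

With this in place the round trip succeeds, so a pool's result handler could deliver the exception to the caller instead of dying.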
Steps To Reproduce
For these reproduction steps I'm using dbt-duckdb, but this applies to all adapters.
- Using the example models, make the first model raise a compilation error:
--my_first_dbt_model.sql
SELECT * from {{ ref("abc") }}
- Call `dbt run` from a Python multiprocessing context:
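import multiprocessing as mp

from dbt.cli.main import cli

def run_dbt():
    # Build a Click context for `dbt run` and invoke it in the worker process.
    ctx = cli.make_context(cli.name, ["run"])
    cli.invoke(ctx)

if __name__ == "__main__":
    with mp.Pool() as pool:
        # Hangs here: the worker's exception cannot be unpickled in the parent.
        pool.apply(run_dbt)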
Relevant log output
02:42:36 [WARNING]: Deprecated functionality
User config should be moved from the 'config' key in profiles.yml to the 'flags' key in dbt_project.yml.
02:42:36 Running with dbt=1.8.4
02:42:37 Registered adapter: duckdb=1.8.2
02:42:37 Unable to do partial parsing because of a version mismatch
02:42:38 Encountered an error:
Compilation Error
Model 'model.test_dbt.my_first_dbt_model' (project2/models/example/my_first_dbt_model.sql) depends on a node named 'abc' which was not found
Exception in thread Thread-8 (_handle_results):
Traceback (most recent call last):
  File "/usr/lib/python3.11/threading.py", line 1045, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.11/threading.py", line 982, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/lib/python3.11/multiprocessing/pool.py", line 579, in _handle_results
    task = get()
           ^^^^^
  File "/usr/lib/python3.11/multiprocessing/connection.py", line 251, in recv
    return _ForkingPickler.loads(buf.getbuffer())
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: TargetNotFoundError.__init__() missing 3 required positional arguments: 'node', 'target_name', and 'target_kind'
Environment
- OS: Ubuntu 20.04
- Python: 3.11.9
- dbt: 1.8.4
Which database adapter are you using with dbt?
other (mention it in "Additional Context")
Additional Context
As noted above, I'm using dbt-duckdb.
The main entry point for this error will most likely be sqlfluff-templater-dbt.
In sqlfluff, monkeypatching `__reduce__` prevents the process from hanging:
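# sqlfluff_templater_dbt/templater.py
import inspect

# DbtBaseException is assumed to be imported already from dbt's exception module.

def _dbt_exception_reduce(self):
    # Rebuild the exception from its __init__ arguments, reading each one
    # back from the attribute of the same name on the instance.
    return (
        type(self),
        tuple(
            getattr(self, arg)
            for arg in inspect.getfullargspec(self.__init__).args
            if arg != "self"
        ),
    )

DbtBaseException.__reduce__ = _dbt_exception_reduce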
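To illustrate how this generic reducer behaves, here is a small sketch that applies the same idea to hypothetical stand-in classes (`StandInBaseError` and `StandInTargetError` are invented for this example and are not dbt classes). It relies on each `__init__` argument being stored on the instance under the same name; if an argument is not stored that way, `getattr` raises `AttributeError`, and keyword-only arguments are not covered because `getfullargspec(...).args` does not list them.

```python
import inspect
import pickle


class StandInBaseError(Exception):
    """Hypothetical stand-in for dbt's base exception class."""


def _reduce_from_init_args(self):
    # Collect the positional __init__ parameter names (minus "self") and read
    # each value back from the attribute of the same name.
    init_args = inspect.getfullargspec(self.__init__).args
    return (
        type(self),
        tuple(getattr(self, arg) for arg in init_args if arg != "self"),
    )


# Patch the base class once; every subclass inherits the reducer.
StandInBaseError.__reduce__ = _reduce_from_init_args


class StandInTargetError(StandInBaseError):
    def __init__(self, node, target_name, target_kind):
        self.node = node
        self.target_name = target_name
        self.target_kind = target_kind
        super().__init__(f"{target_kind} {target_name!r} not found for {node}")


restored = pickle.loads(pickle.dumps(StandInTargetError("model_a", "abc", "node")))
print(type(restored).__name__, restored.node, restored.target_name)
```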