dvc
dvc copied to clipboard
Logger omits unexpected exception during pretty printing, hindering issue diagnosis and resolution
Bug Report
Description
The logger in dvc/logger.py incorrectly handles exceptions that occur during the process of pretty printing error messages. Specifically, when an unexpected exception occurs, the logger fails to include the exception in the log output, resulting in incomplete traceback. This makes it difficult to diagnose issues and understand the root cause of problems.
Reproduce
-
Replace the following lines in
dvc/logger.py: https://github.com/iterative/dvc/blob/27a8222e2a05e70deb78e641c4f4824dcdcc10c6/dvc/logger.py#L147-L153with the following code:
if hasattr(exc, "__pretty_exc__"): try: raise RuntimeError("Some unexpected error") # self.emit_pretty_exception(exc, verbose=_is_verbose()) # if not _is_verbose(): # return except Exception: # noqa: BLE001, S110 pass -
Initialize a DVC project
dvc init --no-scm
tee -a dvc.yaml << END
stages:
stage1:
foreach: [1,2,3]
do:
outs:
- ${item}
END
- Run the
dvc reprocommand with the--verboseflag turned on, like so:
dvc repro --verbose
Actual output
(.venv) ➜ .local git:(master) ✗ dvc repro --verbose
2023-12-26 17:14:56,673 DEBUG: v3.36.1.dev61+gb73754edd, CPython 3.11.2 on <REDACTED>
2023-12-26 17:14:56,673 DEBUG: command: repro --verbose
2023-12-26 17:14:56,848 ERROR: './dvc.yaml' validation failed: required key not provided @ data['stages']['stage1']['do']['cmd']
Traceback (most recent call last):
File "/home/sid/workspace/dvc/dvc/utils/strictyaml.py", line 267, in validate
return schema(data)
^^^^^^^^^^^^
File "/home/sid/workspace/dvc/.venv/lib/python3.11/site-packages/voluptuous/schema_builder.py", line 287, in __call__
return self._compiled([], data)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/sid/workspace/dvc/.venv/lib/python3.11/site-packages/voluptuous/schema_builder.py", line 611, in validate_dict
return base_validate(path, iteritems(data), out)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/sid/workspace/dvc/.venv/lib/python3.11/site-packages/voluptuous/schema_builder.py", line 449, in validate_mapping
raise er.MultipleInvalid(errors)
voluptuous.error.MultipleInvalid: required key not provided @ data['stages']['stage1']['do']['cmd']
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/sid/workspace/dvc/dvc/cli/__init__.py", line 211, in main
ret = cmd.do_run()
^^^^^^^^^^^^
File "/home/sid/workspace/dvc/dvc/cli/command.py", line 27, in do_run
return self.run()
^^^^^^^^^^
File "/home/sid/workspace/dvc/dvc/commands/repro.py", line 13, in run
stages = self.repo.reproduce(**self._common_kwargs, **self._repro_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/sid/workspace/dvc/dvc/repo/__init__.py", line 65, in wrapper
return f(repo, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/sid/workspace/dvc/dvc/repo/scm_context.py", line 152, in run
return method(repo, *args, **kw)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/sid/workspace/dvc/dvc/repo/reproduce.py", line 246, in reproduce
stages = collect_stages(self, targets_list, recursive=recursive, glob=glob)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/sid/workspace/dvc/dvc/repo/reproduce.py", line 46, in collect_stages
stages.extend(repo.stage.collect(target, recursive=recursive, glob=glob))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/sid/workspace/dvc/dvc/repo/stage.py", line 354, in collect
stages = self.from_target(target, glob=glob)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/sid/workspace/dvc/dvc/repo/stage.py", line 208, in from_target
return self.load_all(path=path, name=name, accept_group=accept_group, glob=glob)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/sid/workspace/dvc/dvc/repo/stage.py", line 275, in load_all
stages = dvcfile.stages # type: ignore[attr-defined]
^^^^^^^^^^^^^^
File "/home/sid/workspace/dvc/.venv/lib/python3.11/site-packages/funcy/objects.py", line 25, in __get__
res = instance.__dict__[self.fget.__name__] = self.fget(instance)
^^^^^^^^^^^^^^^^^^^
File "/home/sid/workspace/dvc/dvc/dvcfile.py", line 302, in stages
return self.LOADER(self, self.contents, self.lockfile_contents)
^^^^^^^^^^^^^
File "/home/sid/workspace/dvc/.venv/lib/python3.11/site-packages/funcy/objects.py", line 25, in __get__
res = instance.__dict__[self.fget.__name__] = self.fget(instance)
^^^^^^^^^^^^^^^^^^^
File "/home/sid/workspace/dvc/dvc/dvcfile.py", line 287, in contents
return self._load()[0]
^^^^^^^^^^^^
File "/home/sid/workspace/dvc/dvc/dvcfile.py", line 151, in _load
return self._load_yaml(**kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/sid/workspace/dvc/dvc/dvcfile.py", line 162, in _load_yaml
return strictyaml.load(
^^^^^^^^^^^^^^^^
File "/home/sid/workspace/dvc/dvc/utils/strictyaml.py", line 295, in load
validate(data, schema, text=text, path=path, rev=rev)
File "/home/sid/workspace/dvc/dvc/utils/strictyaml.py", line 269, in validate
raise YAMLValidationError(exc, path, text, rev=rev) from exc
dvc.utils.strictyaml.YAMLValidationError: './dvc.yaml' validation failed
2023-12-26 17:14:56,881 DEBUG: Analytics is disabled.
Expected
Traceback that includes the unexpected error, like so:
Expected output
$ dvc repro --verbose
2023-12-26 17:39:05,504 DEBUG: v3.36.1.dev61+gb73754edd.d20231226, CPython 3.11.2 on <REDACTED>
2023-12-26 17:39:05,504 DEBUG: command: repro --verbose
2023-12-26 17:39:05,687 ERROR: './dvc.yaml' validation failed: required key not provided @ data['stages']['stage1']['do']['cmd'] - Some unexpected error
Traceback (most recent call last):
File "/home/sid/workspace/dvc/dvc/utils/strictyaml.py", line 267, in validate
return schema(data)
^^^^^^^^^^^^
File "/home/sid/workspace/dvc/.venv/lib/python3.11/site-packages/voluptuous/schema_builder.py", line 287, in __call__
return self._compiled([], data)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/sid/workspace/dvc/.venv/lib/python3.11/site-packages/voluptuous/schema_builder.py", line 611, in validate_dict
return base_validate(path, iteritems(data), out)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/sid/workspace/dvc/.venv/lib/python3.11/site-packages/voluptuous/schema_builder.py", line 449, in validate_mapping
raise er.MultipleInvalid(errors)
voluptuous.error.MultipleInvalid: required key not provided @ data['stages']['stage1']['do']['cmd']
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/sid/workspace/dvc/dvc/cli/__init__.py", line 211, in main
ret = cmd.do_run()
^^^^^^^^^^^^
File "/home/sid/workspace/dvc/dvc/cli/command.py", line 27, in do_run
return self.run()
^^^^^^^^^^
File "/home/sid/workspace/dvc/dvc/commands/repro.py", line 13, in run
stages = self.repo.reproduce(**self._common_kwargs, **self._repro_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/sid/workspace/dvc/dvc/repo/__init__.py", line 65, in wrapper
return f(repo, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/sid/workspace/dvc/dvc/repo/scm_context.py", line 152, in run
return method(repo, *args, **kw)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/sid/workspace/dvc/dvc/repo/reproduce.py", line 246, in reproduce
stages = collect_stages(self, targets_list, recursive=recursive, glob=glob)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/sid/workspace/dvc/dvc/repo/reproduce.py", line 46, in collect_stages
stages.extend(repo.stage.collect(target, recursive=recursive, glob=glob))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/sid/workspace/dvc/dvc/repo/stage.py", line 354, in collect
stages = self.from_target(target, glob=glob)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/sid/workspace/dvc/dvc/repo/stage.py", line 208, in from_target
return self.load_all(path=path, name=name, accept_group=accept_group, glob=glob)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/sid/workspace/dvc/dvc/repo/stage.py", line 275, in load_all
stages = dvcfile.stages # type: ignore[attr-defined]
^^^^^^^^^^^^^^
File "/home/sid/workspace/dvc/.venv/lib/python3.11/site-packages/funcy/objects.py", line 25, in __get__
res = instance.__dict__[self.fget.__name__] = self.fget(instance)
^^^^^^^^^^^^^^^^^^^
File "/home/sid/workspace/dvc/dvc/dvcfile.py", line 302, in stages
return self.LOADER(self, self.contents, self.lockfile_contents)
^^^^^^^^^^^^^
File "/home/sid/workspace/dvc/.venv/lib/python3.11/site-packages/funcy/objects.py", line 25, in __get__
res = instance.__dict__[self.fget.__name__] = self.fget(instance)
^^^^^^^^^^^^^^^^^^^
File "/home/sid/workspace/dvc/dvc/dvcfile.py", line 287, in contents
return self._load()[0]
^^^^^^^^^^^^
File "/home/sid/workspace/dvc/dvc/dvcfile.py", line 151, in _load
return self._load_yaml(**kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/sid/workspace/dvc/dvc/dvcfile.py", line 162, in _load_yaml
return strictyaml.load(
^^^^^^^^^^^^^^^^
File "/home/sid/workspace/dvc/dvc/utils/strictyaml.py", line 295, in load
validate(data, schema, text=text, path=path, rev=rev)
File "/home/sid/workspace/dvc/dvc/utils/strictyaml.py", line 269, in validate
raise YAMLValidationError(exc, path, text, rev=rev) from exc
dvc.utils.strictyaml.YAMLValidationError: './dvc.yaml' validation failed
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/sid/workspace/dvc/dvc/logger.py", line 149, in emit
raise RuntimeError("Some unexpected error")
RuntimeError: Some unexpected error
2023-12-26 17:39:05,689 DEBUG: Analytics is disabled.
Output of dvc doctor:
$ dvc doctor
DVC version: 3.36.1.dev61+gb73754edd.d20231226
----------------------------------------------
Platform: Python 3.11.2 on <REDACTED>
Subprojects:
dvc_data = 3.3.0
dvc_objects = 3.0.0
dvc_render = 1.0.0
dvc_task = 0.3.0
scmrepo = 2.0.2
Supports:
azure (adlfs = 2023.10.0, knack = 0.11.0, azure-identity = 1.15.0),
gdrive (pydrive2 = 1.17.0),
gs (gcsfs = 2023.12.2.post1),
hdfs (fsspec = 2023.12.2, pyarrow = 14.0.1),
http (aiohttp = 3.9.1, aiohttp-retry = 2.8.3),
https (aiohttp = 3.9.1, aiohttp-retry = 2.8.3),
oss (ossfs = 2023.12.0),
s3 (s3fs = 2023.12.2, boto3 = 1.28.17),
ssh (sshfs = 2023.10.0),
webdav (webdav4 = 0.9.8),
webdavs (webdav4 = 0.9.8),
webhdfs (fsspec = 2023.12.2)
Config:
Global: /home/sid/.config/dvc
System: /etc/xdg/dvc
Cache types: hardlink, symlink
Cache directory: ext4 on /dev/sda9
Caches: local
Remotes: local
Workspace directory: ext4 on /dev/sda9
Repo: dvc, git
Repo.site_cache_dir: /var/tmp/dvc/repo/11de0fd37bc5e8992fc0b3c6878e9759
Additional Information (if any):
18 tests in the tests/func/utils/test_strict_yaml.py file are considered flaky because they consistently report a passing result, even when a RuntimeError is intentionally triggered during the emitting of pretty exceptions.
Related: #9704, https://github.com/iterative/dvc/pull/10110#issuecomment-1841266261, https://github.com/iterative/dvc/pull/10110#discussion_r1436283000
I'll be happy to help resolve this bug. I'll present a solution on how to fix this bug tomorrow.
Closing, this is intentional. pretty_exc is a hack, and we have to be careful working around that.