astroid icon indicating copy to clipboard operation
astroid copied to clipboard

Ast conversion for a dataclass raises DuplicateBasesError which crashes Pylint

Open AleksMat opened this issue 1 year ago • 6 comments

Here is an example that causes Pylint to crash because Astroid raises an unexpected error:

import dataclasses

from typing import TypeVar, Protocol

BaseT = TypeVar("BaseT")
T = TypeVar("T", bound=BaseT)


class ConfigBase(Protocol[BaseT]):
    ...


class Config(ConfigBase[T], Protocol[T]):
    ...


@dataclasses.dataclass
class DatasetConfig(Config[T]):
    ...

This is a valid Python code but running Pylint crashes with the following stack trace:

Traceback (most recent call last):
  File "<path>/venvs/notebook/lib/python3.11/site-packages/pylint/lint/pylinter.py", line 976, in get_ast
    return MANAGER.ast_from_file(filepath, modname, source=True)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<path>/Python/astroid/astroid/manager.py", line 167, in ast_from_file
    return AstroidBuilder(self).file_build(filepath, modname)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<path>/Python/astroid/astroid/builder.py", line 145, in file_build
    return self._post_build(module, builder, encoding)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<path>/Python/astroid/astroid/builder.py", line 173, in _post_build
    module = self._manager.visit_transforms(module)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<path>/Python/astroid/astroid/manager.py", line 128, in visit_transforms
    return self._transform.visit(node)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<path>/Python/astroid/astroid/transforms.py", line 162, in visit
    return self._visit(node)
           ^^^^^^^^^^^^^^^^^
  File "<path>/Python/astroid/astroid/transforms.py", line 84, in _visit
    visited = self._visit_generic(value)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<path>/Python/astroid/astroid/transforms.py", line 112, in _visit_generic
    return [self._visit_generic(child) for child in node]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<path>/Python/astroid/astroid/transforms.py", line 112, in <listcomp>
    return [self._visit_generic(child) for child in node]
            ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<path>/Python/astroid/astroid/transforms.py", line 119, in _visit_generic
    return self._visit(node)
           ^^^^^^^^^^^^^^^^^
  File "<path>/Python/astroid/astroid/transforms.py", line 87, in _visit
    return self._transform(node)
           ^^^^^^^^^^^^^^^^^^^^^
  File "<path>/Python/astroid/astroid/transforms.py", line 67, in _transform
    ret = transform_func(node)
          ^^^^^^^^^^^^^^^^^^^^
  File "<path>/Python/astroid/astroid/brain/brain_dataclasses.py", line 84, in dataclass_transform
    init_str = _generate_dataclass_init(
               ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<path>/Python/astroid/astroid/brain/brain_dataclasses.py", line 251, in _generate_dataclass_init
    prev_pos_only_store, prev_kw_only_store = _find_arguments_from_base_classes(node)
                                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<path>/Python/astroid/astroid/brain/brain_dataclasses.py", line 174, in _find_arguments_from_base_classes
    for base in reversed(node.mro()):
                         ^^^^^^^^^^
  File "<path>/Python/astroid/astroid/nodes/scoped_nodes/scoped_nodes.py", line 2863, in mro
    return self._compute_mro(context=context)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<path>/Python/astroid/astroid/nodes/scoped_nodes/scoped_nodes.py", line 2851, in _compute_mro
    unmerged_mro = clean_duplicates_mro(unmerged_mro, self, context)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<path>/Python/astroid/astroid/nodes/scoped_nodes/scoped_nodes.py", line 153, in clean_duplicates_mro
    raise DuplicateBasesError(
astroid.exceptions.DuplicateBasesError: Duplicates found in MROs (DatasetConfig), (Config, ConfigBase, Protocol, Protocol, object), (Config) for <ClassDef.DatasetConfig l.19 at 0x7f1bd5191810>.
Can't write the issue template for the crash in <path>/.cache/pylint/pylint-crash-2024-10-28-18-14-28.txt because of: '[Errno 2] No such file or directory: '<path>/.cache/pylint/pylint-crash-2024-10-28-18-14-28.txt''
Here's the content anyway:
First, please verify that the bug is not already filled:
https://github.com/pylint-dev/pylint/issues/

Then create a new issue:
https://github.com/pylint-dev/pylint/issues/new?labels=Crash 💥%2CNeeds triage 📥



Issue title:
Crash ``Building error when trying to create ast representation of module 'crash'`` (if possible, be more specific about what made pylint crash)

### Bug description

When parsing the following ``a.py``:

<!--
 If sharing the code is not an option, please state so,
 but providing only the stacktrace would still be helpful.
 -->

```python
import dataclasses

from typing import TypeVar, Protocol

BaseT = TypeVar("BaseT")
T = TypeVar("T", bound=BaseT)
U = TypeVar("U")


class ConfigBase(Protocol[BaseT]):
    ...


class Config(ConfigBase[T], Protocol[U]):
    ...


@dataclasses.dataclass
class DatasetConfig(Config[T, U]):
    ...

Command used

pylint a.py

Pylint output

pylint crashed with a ``AstroidBuildingError`` and with the following stacktrace:
Traceback (most recent call last):
  File "<path>/venvs/notebook/lib/python3.11/site-packages/pylint/lint/pylinter.py", line 976, in get_ast
    return MANAGER.ast_from_file(filepath, modname, source=True)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<path>/Python/astroid/astroid/manager.py", line 167, in ast_from_file
    return AstroidBuilder(self).file_build(filepath, modname)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<path>/Python/astroid/astroid/builder.py", line 145, in file_build
    return self._post_build(module, builder, encoding)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<path>/Python/astroid/astroid/builder.py", line 173, in _post_build
    module = self._manager.visit_transforms(module)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<path>/Python/astroid/astroid/manager.py", line 128, in visit_transforms
    return self._transform.visit(node)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<path>/Python/astroid/astroid/transforms.py", line 162, in visit
    return self._visit(node)
           ^^^^^^^^^^^^^^^^^
  File "<path>/Python/astroid/astroid/transforms.py", line 84, in _visit
    visited = self._visit_generic(value)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<path>/Python/astroid/astroid/transforms.py", line 112, in _visit_generic
    return [self._visit_generic(child) for child in node]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<path>/Python/astroid/astroid/transforms.py", line 112, in <listcomp>
    return [self._visit_generic(child) for child in node]
            ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<path>/Python/astroid/astroid/transforms.py", line 119, in _visit_generic
    return self._visit(node)
           ^^^^^^^^^^^^^^^^^
  File "<path>/Python/astroid/astroid/transforms.py", line 87, in _visit
    return self._transform(node)
           ^^^^^^^^^^^^^^^^^^^^^
  File "<path>/Python/astroid/astroid/transforms.py", line 67, in _transform
    ret = transform_func(node)
          ^^^^^^^^^^^^^^^^^^^^
  File "<path>/Python/astroid/astroid/brain/brain_dataclasses.py", line 84, in dataclass_transform
    init_str = _generate_dataclass_init(
               ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<path>/Python/astroid/astroid/brain/brain_dataclasses.py", line 251, in _generate_dataclass_init
    prev_pos_only_store, prev_kw_only_store = _find_arguments_from_base_classes(node)
                                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<path>/Python/astroid/astroid/brain/brain_dataclasses.py", line 174, in _find_arguments_from_base_classes
    for base in reversed(node.mro()):
                         ^^^^^^^^^^
  File "<path>/Python/astroid/astroid/nodes/scoped_nodes/scoped_nodes.py", line 2863, in mro
    return self._compute_mro(context=context)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<path>/Python/astroid/astroid/nodes/scoped_nodes/scoped_nodes.py", line 2851, in _compute_mro
    unmerged_mro = clean_duplicates_mro(unmerged_mro, self, context)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<path>/Python/astroid/astroid/nodes/scoped_nodes/scoped_nodes.py", line 153, in clean_duplicates_mro
    raise DuplicateBasesError(
astroid.exceptions.DuplicateBasesError: Duplicates found in MROs (DatasetConfig), (Config, ConfigBase, Protocol, Protocol, object), (Config) for <ClassDef.DatasetConfig l.19 at 0x7f1bd5191810>.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<path>/venvs/notebook/lib/python3.11/site-packages/pylint/lint/pylinter.py", line 716, in _get_asts
    ast_per_fileitem[fileitem] = self.get_ast(
                                 ^^^^^^^^^^^^^
  File "<path>/venvs/notebook/lib/python3.11/site-packages/pylint/lint/pylinter.py", line 998, in get_ast
    raise astroid.AstroidBuildingError(
astroid.exceptions.AstroidBuildingError: Building error when trying to create ast representation of module 'crash'

Expected behavior

No crash.

Pylint version

pylint 3.3.1
astroid 3.3.5
Python 3.11.9 (main, Jun 19 2024, 00:38:48) [GCC 13.2.0]

OS / Environment

linux (Linux)

Additional dependencies

************* Module crash
pylint/crash.py:1:0: F0002: pylint/crash.py: Fatal error while checking 'pylint/crash.py'. Please open an issue in our bug tracker so we address this. There is a pre-filled template that you can use in '<path>/.cache/pylint/pylint-crash-2024-10-28-18-14-28.txt'. (astroid-error)

Note that this only happens for a dataclass because brain_dataclasses.py calls node.mro() which can raise the error and the error is never caught.

AleksMat avatar Oct 28 '24 18:10 AleksMat

Feels to me like we should fix: https://github.com/pylint-dev/astroid/blob/c15d6b260132febd022f50d3a40df0825da5ea4d/astroid/nodes/scoped_nodes/scoped_nodes.py#L143

Not by telling it to ignore duplicates, but by making it understand that some things aren't duplicates. Looking at the stack it seems to complain about Protocol. That makes sense as it probably doesn't recognise that it is a bit of weird class that can occur multiple times but is the same class.

DanielNoord avatar Oct 30 '24 20:10 DanielNoord

I agree that improvements in recognizing non-duplicates in scoped_nodes.py are also required to properly solve this issue. But currently clean_duplicates_mro logic is reached twice in the runtime process:

  • During code parsing from the brain_dataclasses.py module. Nothing in this path ever catches or handles any errors. That is why the crash is happening.
  • During code checking where Pylint checks for consistent mro, catches DuplicateBasesError, and converts it into duplicate-bases Pylint error.

Therefore, only changing/improving clean_duplicates_mro would mean that either duplicate-bases would never be caught again for dataclasses or that there will still be crashes. So whatever the fix will be, something definitely has to change in the brain_dataclasses.py module as well. Either it has to catch and handle potential errors or tell scoped_nodes.py to ignore them.

Another example to consider is the comparison between the following code snippets:

class Duplicates(str, str):
  pass

and

import dataclasses

@dataclasses.dataclass
class Duplicates(str, str):
  pass

For the first one Pylint finishes linting and reports duplicate-bases error, for the second one Pylint crashes because brain_dataclasses.py doesn't handle DuplicateBasesError. Neither of the snippets is actually a valid Python code because of a TypeError at runtime. But I assume crashes are never a desired outcome, right?

AleksMat avatar Oct 31 '24 10:10 AleksMat

Hmm yeah I understand the problem a little better now.

@jacobtylerwalls do you have an opinion here? I have tried all day but couldn't come up with a better solution than this boolean flag, but I don't really like it.

DanielNoord avatar Oct 31 '24 20:10 DanielNoord

My thought is that brain_dataclasses.py should be catching MroError when it calls .mro(). That's the pattern I see elsewhere.

jacobtylerwalls avatar Nov 01 '24 02:11 jacobtylerwalls

But here we actually want to compute the mro. If we were to catch the exception there is no way for us to determine the correct mro of the class and can't construct it in the brain.

DanielNoord avatar Nov 01 '24 07:11 DanielNoord

That seems fine to me. This is an edge case, no? And there are other errors that subclass MroError we should catch also, I think.

jacobtylerwalls avatar Nov 01 '24 10:11 jacobtylerwalls