cpython icon indicating copy to clipboard operation
cpython copied to clipboard

`__getattribute__` does not propagate `AttributeError` error-message if `__getattr__` fails as well

Open randolf-scholz opened this issue 2 years ago • 19 comments

This problem only appears when __getattr__ is implemented:

EDIT: Updated MWE

import pandas as pd

class MWE:
    @property
    def foo(self):
        s = pd.Index([1,2,3])
        return s.iloc[0]   # actual bug! pd.Index has no .iloc attribute

MWE().foo  # AttributeError: 'Index' object has no attribute 'iloc'
# without `__getattr__`  implemented we get the right error message!


class MWE_with_getattr:
    def __getattr__(self, key):
        if key == "dummy":
            return "xyz"
        raise AttributeError(f"{self.__class__.__name__} has no attribute {key}!")

    @property
    def foo(self):
        s = pd.Index([1,2,3])
        return s.iloc[0]   # actual bug! pd.Index has no .iloc attribute


MWE_with_getattr().foo  # AttributeError: MWE_with_getattr has no attribute foo!
# Traceback never mentions "AttributeError: 'Index' object has no attribute 'iloc'"
# Implementing `__getattr__` gobbles up the error message!

Expectation

The traceback should include "AttributeError: 'Index' object has no attribute 'iloc". The issue seems to be that object.__getattribute__ does not re-raise the AttributeError in case the fallback __getattr__ fails as well. According to the documentation, one might expect attribute lookup to work like:

class object:
    def __getattribute__(self, key):
        try:
            attr = self._default_attribute_lookup(key)
        except AttributeError as E:
            try:  # alternative lookup via __getattr__
                attr = self.__getattr__(key)
            except Exception as F:
                raise F from E   # re-raise if __getattr__ fails as well
            else:
                return attr
        else:
            return attr

But instead, the original error gets swallowed if __getattr__ fails as well, i.e. it works more like:

class object:
    def __getattribute__(self, key):
        try:
            attr = self._default_attribute_lookup(key)
        except AttributeError as E:
            # if __getattr__ fails, we are left in the dark as to why default lookup failed.
            return self.__getattr__(key)  
        else:
            return attr

This can be problematic, because:

  1. Attribute lookup can fail due to actual bugs that unintentionally raise AttributeError, for example within the code of a @property
  2. Attributes/Properties might only exist conditionally, and with the current behavior, as the traceback does not include the original AttributeError in case the fallback __getattr__ fails as well, the error message explaining the conditional failure is not passed along.

Actual Output

The code above produces the following output:

Traceback (most recent call last):
  File "/home/rscholz/Projects/KIWI/tsdm/bugs/python/__getattr__attributeerror.py", line 25, in <module>
    MWE_with_getattr().foo  # AttributeError: MWE_with_getattr has no attribute foo!
    ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/rscholz/Projects/KIWI/tsdm/bugs/python/__getattr__attributeerror.py", line 17, in __getattr__
    raise AttributeError(f"{self.__class__.__name__} has no attribute {key}!")
AttributeError: MWE_with_getattr has no attribute foo!

The problem with non-existent .iloc is never mentioned!

Old MWE
from functools import cached_property

class Foo:
    def __getattr__(self, key):
        if key == "secret":
            return "cake"
        raise AttributeError(f"{self.__class__.__name__} has no attribute {key}!")


class Bar(Foo):
    @property
    def prop(self) -> int:
        raise AttributeError("Invisible Error message")

    @cached_property
    def cached_prop(self) -> int:
        raise AttributeError("Invisible Error message")

    filler: str = "Lorem_ipsum"


obj = Bar()
obj.prop  # ✘ AttributeError: Bar has no attribute prop!
obj.cached_prop  # ✘ AttributeError: Bar has no attribute cached_prop!

Expectation

The traceback should include "Invisible Error message". If we replace the AttributeError with, say, ValueError the problem does not occur.

Actual Output

The code above produces the following output:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[11], line 25
     21     filler: str = "Lorem_ipsum"
     24 obj = Bar()
---> 25 obj.prop  # ✘ AttributeError: Bar has no attribute cached_prop!

Cell In[11], line 8, in Foo.__getattr__(self, key)
      6 if key == "secret":
      7     return "cake"
----> 8 raise AttributeError(f"{self.__class__.__name__} has no attribute {key}!")

AttributeError: Bar has no attribute prop!

Your environment

  • python 3.11.3
  • Ubuntu 22.04

randolf-scholz avatar Apr 27 '23 17:04 randolf-scholz

To avoid this, I think property and cached_property need to wrap the function call in another try-except block and re-raise a different error code than AttributeError. e.g.

try:
    val = self.func(instance)
except Exception as E:
    raise RuntimeError("Property evaluation failed!") from E

instead of the plain

https://github.com/python/cpython/blob/a5308e188b810e5cc69c1570bdc9b21ed6c87805/Lib/functools.py#L992

randolf-scholz avatar Apr 27 '23 17:04 randolf-scholz

Not sure if this is a "bug". The doc clearly states that __getattr__ will be called when the default attribute access fails with an AttributeError. Yes, the AttributeError is raised in your code, but it's still an AttributeError and the correct behavior is to call __getattr__ -> which raised the error you saw.

gaogaotiantian avatar Apr 27 '23 17:04 gaogaotiantian

@gaogaotiantian The problem is that the error message doesn't show in the traceback, which makes it difficult to debug. I wrote this after spending quite some time debugging, because I thought there was a problem with my __getattr__, when the real error occurred in the computation of the property. Tracebacks should always go back to the original source of the error, no?

randolf-scholz avatar Apr 27 '23 17:04 randolf-scholz

The AttributeError raised in your property is handled so it won't show in traceback. Think of it as:

try:
    obj.prop()
except AttributeError:
    Foo._getattr__(obj, "prop")

I believe the original design for this is for a "non-exist" attribute, not an AttributeError raised when evaluating the attribute. However, I think this is a well-documented expected behavior.

Your proposal is a breaking change, which would affect code that actually taking advantage of this behavior. For example, I might want property methods that make the attribute "appear to be non-existent" so that it could be handled as a missing attribute.

class A:
    @property
    def data(self):
        if <condition>:
            return ...
        else:
            raise AttributeError()

gaogaotiantian avatar Apr 27 '23 18:04 gaogaotiantian

This is a long-existing and fully documented behavior from before Python 3.5.

Do not raise AttributeError then.

sunmy2019 avatar Apr 30 '23 05:04 sunmy2019

@sunmy2019

Do not raise AttributeError then.

In my case, there was an actual bug in my code, I wasn't intentionally raising an AttributeError during the evaluation of the property (I accidentally used .loc on a pandas.Index object). My point is: if the attribute lookup fails, the traceback should include the error message!

But even when it is intentional, like in the example @gaogaotiantian gave:

class A:
    @property
    def data(self):
        if <condition>:
            return ...
        else:
            raise AttributeError("Failed to load property due to <condition>")

Wouldn't it make sense to include the message "Failed to load property due to <condition>" if __getattr__ fails? I would expect the following behavior:

  • If the default default attribute access fails with an AttributeError(msg), then try __getattr__.
  • If __getattr__ fails as well, include msg in the traceback.
  • Instead, currently we only get the error message from __getattr__!

I believe both you and @gaogaotiantian have misunderstood the point of this issue. The issue is not about how Python's mechanism for looking up attributes works, the issue is that if that mechanism fails, not all relevant error messages are included in the traceback!

randolf-scholz avatar Apr 30 '23 10:04 randolf-scholz

Here is an updated MWE to more clearly demonstrate the issue:

import pandas as pd

class MWE:
    @property
    def foo(self):
        s = pd.Index([1,2,3])
        return s.iloc[0]   # actual bug! pd.Index has no .iloc attribute

MWE().foo  # AttributeError: 'Index' object has no attribute 'iloc'
# without `__getattr__`  implemented we get the right error message!


class MWE_with_getattr:
    def __getattr__(self, key):
        if key == "dummy":
            return "xyz"
        raise AttributeError(f"{self.__class__.__name__} has no attribute {key}!")

    @property
    def foo(self):
        s = pd.Index([1,2,3])
        return s.iloc[0]   # actual bug! pd.Index has no .iloc attribute


MWE_with_getattr().foo  # AttributeError: MWE_with_getattr has no attribute foo!
# Traceback never mentions "AttributeError: 'Index' object has no attribute 'iloc'"
# Implementing `__getattr__` gobbles up the error message!

randolf-scholz avatar Apr 30 '23 10:04 randolf-scholz

  • If __getattr__ fails as well

if it succeeded, then there is no way to indicate this failure of property access.

sunmy2019 avatar Apr 30 '23 11:04 sunmy2019

  • If __getattr__ fails as well

if it succeeded, then there is no way to indicate this failure of property access.

That's outside the scope of this issue.

randolf-scholz avatar Apr 30 '23 11:04 randolf-scholz

@randolf-scholz , first of all, we should all agree that this is not a bug report anymore, this is a feature request now. The code behaves as expected and documented, what you want is something more - a better traceback message.

I agree that in your case, being able to know where the attribute error originated would be helpful. The original improvement of yours (reraising as a different exception) was a breaking change (the code I gave would behave differently) so I don't think that would happen.

Is it possible to make __getattr__ track the original AttributeError and raise from it when there's an exception happening in __getattr__? I believe theoretically yes. However, how would you treat the much more common use cases - the normal attribute missing and AttributeError in __getattr__? You would have two redundent tracebacks - the original AttributeError would not contain any useful information.

I don't think it's worth it to complicate a more common use case, just to make a rare one better.

In this specific case, I think the right path is:

    @property
    def foo(self):
        try:
            s = pd.Index([1,2,3])
            return s.iloc[0]   # actual bug! pd.Index has no .iloc attribute
        except AttributeError as exc:
            raise RuntimeError() from exc

Or, write your own wrapper for properties.

You may argue that the user would have to know that the code in the method could potentially raise AttributeError to be able to do this, that's right and that's what they need to consider when they use __getattr__ - it's documented clearly that the method will be called when there's an AttributeError, and the user is responsible to deal with the unexpected AttributeError.

gaogaotiantian avatar May 01 '23 00:05 gaogaotiantian

@gaogaotiantian

However, how would you treat the much more common use cases - the normal attribute missing and AttributeError in getattr? You would have two redundent tracebacks - the original AttributeError would not contain any useful information.

I don't see any problem in that, and in fact, it can be a very useful thing as your own example demonstrates:

class A:
    def __getattr__(self, key):
        try:
            attr = self._fallback(key)
        except Exception as E:
            raise AttributeError(f"{self} has no attribute {key}!") from E
        else:
            return attr

    @property
    def data(self):
        if <condition>:
            return ...
        else:
            raise AttributeError("Failed to load property due to <condition>")

With current behavior, if self._fallback(key) fails (for example with NotImplementedError), one only gets the message "object has no attribute data!", without any information about why the original <condition> failed. I'd wager that most of the time, this is useful info to have.

randolf-scholz avatar May 01 '23 08:05 randolf-scholz

one only gets the message "object has no attribute data!"

This is in the doc.

object.__getattr__ should either return the (computed) attribute value or raise an AttributeError exception.

I think AttributeError is like a bridge for __getattr__ and normal attributes.

sunmy2019 avatar May 01 '23 08:05 sunmy2019

@sunmy2019

This is in the doc.

I don't see it explicitly described there. When I read the

relevant passage

object.getattr(self, name) Called when the default attribute access fails with an AttributeError (either getattribute() raises an AttributeError because name is not an instance attribute or an attribute in the class tree for self; or get() of a name property raises AttributeError). This method should either return the (computed) attribute value or raise an AttributeError exception.

Note that if the attribute is found through the normal mechanism, getattr() is not called. (This is an intentional asymmetry between getattr() and setattr().) This is done both for efficiency reasons and because otherwise getattr() would have no way to access other attributes of the instance. Note that at least for instance variables, you can fake total control by not inserting any values in the instance attribute dictionary (but instead inserting them in another object). See the getattribute() method below for a way to actually get total control over attribute access.

object.getattribute(self, name) Called unconditionally to implement attribute accesses for instances of the class. If the class also defines getattr(), the latter will not be called unless getattribute() either calls it explicitly or raises an AttributeError. This method should return the (computed) attribute value or raise an AttributeError exception. In order to avoid infinite recursion in this method, its implementation should always call the base class method with the same name to access any attributes it needs, for example, object.getattribute(self, name).

Note This method may still be bypassed when looking up special methods as the result of implicit invocation via language syntax or built-in functions. See Special method lookup. For certain sensitive attribute accesses, raises an auditing event object.getattr with arguments obj and name.

My intuitive expectation would be that the attribute-lookup roughly works as:

class object:
    def __getattribute__(self, key):
        try:
            attr = self._default_attribute_lookup(key)
        except AttributeError as E:
            try:  # alternative lookup via __getattr__
                attr = self.__getattr__(key)
            except Exception as F:
                raise F from E   # re-raise if __getattr__ fails as well
            else:
                return attr
        else:
            return attr

But instead, the original error gets swallowed if __getattr__ fails as well.

class object:
    def __getattribute__(self, key):
        try:
            attr = self._default_attribute_lookup(key)
        except AttributeError as E:
            # if __getattr__ fails, we are left in the dark as to why default lookup failed.
            return self.__getattr__(key)  
        else:
            return attr

randolf-scholz avatar May 01 '23 08:05 randolf-scholz

I can imagine that possibly the nested try-except block was avoided for performance reasons. But since we have "Zero cost" exception handling (#84403) now, that shouldn't be an issue anymore.

randolf-scholz avatar May 01 '23 09:05 randolf-scholz

I think this is a valid and reasonable feature request. There may be reasons not to do it that would become clear in attempting to implement it, but I don't see a priori reasons to dismiss the feature.

The only real objection to the chaining that has been raised so far is this one:

how would you treat the much more common use cases - the normal attribute missing and AttributeError in getattr? You would have two redundent tracebacks - the original AttributeError would not contain any useful information.

But I don't think this is a strong objection. In this case, the newly-chained exception would accurately reflect the situation: a normal attribute access failed, __getattr__ was called, and it also failed with an exception. The traceback would be longer than it is now, but it wouldn't be misleading or confusing. The exception you get currently would still be featured at the "bottom" of the traceback. And in more complex cases, critical debugging information wouldn't be totally hidden.

carljm avatar May 01 '23 17:05 carljm

I agree with @carljm. This isn't a bug, and the the original suggestion of wrapping the AttributeError with a RuntimeError would have broken a lot of code. But the suggestion of chaining the exception is much stronger, and could be a pretty useful feature in some situations.

AlexWaygood avatar May 01 '23 18:05 AlexWaygood

Sure, we can try to build a prototype and see if breaks anything unexpected.

gaogaotiantian avatar May 01 '23 18:05 gaogaotiantian

Just encountered this again, after some very painful debugging I found https://stackoverflow.com/questions/36575068 and remembered that I had this issue before.

user2483412 Provided a good workaround:

def __getattr__(self, name):
    if name == 'abc':
        return 'abc'
    return self.__getattribute__(name)

instead of

def __getattr__(self, name):
    if name == 'abc':
        return 'abc'
    raise AttributeError

I still think that the re-raising should, because this is incredibly frustrating to debug. One could even add a note to consider this implementation.

randolf-scholz avatar Oct 06 '23 21:10 randolf-scholz

Just encountered this again, after some very painful debugging I found https://stackoverflow.com/questions/36575068 and remembered that I had this issue before.

user2483412 Provided a good workaround:

def __getattr__(self, name):
    if name == 'abc':
        return 'abc'
    return self.__getattribute__(name)

I would be wary of this approach, as it basically means the attribute access flow is __getattribute__ -> __getattr__ -> __getattribute__ again. It can make things even more unclear from a development perspective and have unexpected side-effects. Your property getter will now run twice, so any side-effects introduced the first time running the getter may impact how it runs the second time. I've recently discovered a tricky bug with DRF due to this exact type of error handling.

In the bug, during the first attribute access, the property would check if an instance attribute is already set, and if it was, it would be immediately returned. Otherwise, it would run a function to create the value for that attribute. If that function raised an exception, the instance attribute would be set to an empty value and the error would be re-raised. Below is a minimal example:

class Request:

    _data = None

    @property
    def data(self):
        if self._data is None:
            self.load_data()

        return self._data

    def load_data(self):
        try:
            self._data = do_something()
        except:
            self._data = {}
            raise

Now with the flow mentioned above, if an AttributeError was raised, the property getter would run twice (via the two calls to __getattribute__). On the second time, the property getter would now have a valid (but empty and incorrect) value for that attribute to return. This ended up suppressing a bug with my code, which was using DRF, and it made it seem like everything was working as expected when it wasn't.

james-mchugh avatar Jun 29 '24 18:06 james-mchugh