thriftpy2 icon indicating copy to clipboard operation
thriftpy2 copied to clipboard

Local struct shadows imported module

Open FrankPortman opened this issue 3 months ago • 9 comments

When a local struct has the same name as an imported module (e.g., local struct named 'user_types' when there's an imported module 'user_types.thrift'), thriftpy2 incorrectly resolves ALL qualified references like 'user_types.UserProfile' to the local struct instead of accessing the struct from the imported module.

This makes it impossible to reference structs from an imported module when there's a naming collision with a local struct.

EXPECTED BEHAVIOR:

  • 'user_types.UserProfile' should resolve to UserProfile from user_types module
  • 'user_types' (unqualified) should resolve to the local struct

ACTUAL BEHAVIOR:

  • Both resolve to the local 'user_types' struct
  • The qualified path is completely ignored

Ref files/tests:

// user_types.thrift
struct UserProfile {
    1: required string user_id
    2: required string username
    3: optional string email
}
// main.thrift

include "user_types.thrift"

struct UserProfile { 
    1: required string local_field1
    2: optional i32 local_field2
}

struct ApplicationData {
    // Case 1a: Reference to local struct that has same name as imported module
    1: required user_types localUserTypes

    // Case 1b: SHOULD reference user_types.UserProfile from imported module
    // BUT will incorrectly resolve to local user_types struct
    2: required user_types.UserProfile importedUserProfile
}

However it is okay (and it is expected to be okay) if the imported struct collides with the name of a local struct. It seems the issue only occurs if the imported module collides with the name of a local struct.

FrankPortman avatar Sep 18 '25 16:09 FrankPortman

Hi @FrankPortman, thank you for reporting this issue, and sorry for the late reply.

I want to use the code in your PR's test as an example. Let's say we have two thrift files:

name_collision_imported.thrift:

// This file is imported and has a struct named UserProfile
struct UserProfile {
    1: required string user_id
    2: required string username
    3: optional string email
}

name_collision.thrift:

// Test case for when a local struct has the same name as an imported module
include "name_collision_imported.thrift"

// Local struct with the same name as the imported module
struct name_collision_imported {
    1: required string local_field1
    2: optional i32 local_field2
}

struct TestStruct {
    // Should resolve to the local struct 'name_collision_imported'
    1: required name_collision_imported localStruct

    // Should resolve to UserProfile from the imported module
    2: required name_collision_imported.UserProfile importedUserProfile
}

What you expected is to refer to TestStruct.localStruct as the local struct name_collision_imported, and to refer to TestStruct.importedUserProfile as a struct named UserProfile, which is defined in the name_collision_imported file that has the same name as the local struct name_collision_imported.

But I think this is strange behavior. You can think of it like importing a Python module foo, then defining a local variable foo under the import statement. After that, you want to refer to the type foo directly as foo, and to refer to foo.bar as a type defined in foo.py. Python doesn’t allow this kind of usage because foo is already shadowed.

I tried Apache Thrift with these files:

thrift --gen py name_collision_imported.thrift
thrift --gen py name_collision.thrift

And I found that although the compiler allows this kind of usage, importing it leads to an error:

>>> from name_collision import ttypes
Traceback (most recent call last):
  File "<python-input-1>", line 1, in <module>
    from name_collision import ttypes
  File "/tmp/xxx/gen-py/name_collision/ttypes.py", line 172, in <module>
    (2, TType.STRUCT, 'importedUserProfile', [name_collision_imported.ttypes.UserProfile, None], None, ),  # 2
                                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: type object 'name_collision_imported' has no attribute 'ttypes'

It has the same behavior in Thriftpy, where name_collision_imported is shadowed by the local struct name_collision_imported.

As a result, I think the current behavior is consistent with Python and Apache Thrift, so we should keep it. That said, I’m also open to being convinced otherwise by different reasons.

aisk avatar Sep 22 '25 16:09 aisk

That is a good point @aisk. I admit I did not actually check with the main Apache Thrift compiler.

My expectation came from Scrooge/Scala, which can gracefully tolerate this thanks to Scala’s namespaces.

If ThriftPy2 special-cased dotted names to bypass shadowing, we’d diverge from Apache Thrift/Python and create schemas that ‘work in ThriftPy2’ but fail under the official compiler. That’s risky for multi-lang users.

WDYT about one of the following:

  • parse error in thriftpy2 (to mimic what you then see with Apache Thrift)
    • for my personal use case this would be okay even if diverges from Scrooge/Scala since we typically define the codegen together. So it is a little confusing if it will only trigger from the Python side, but at least it is automatically caught.
  • parse error but opt-in via flags
  • allow the behavior i am describing but follow something like https://github.com/Thriftpy/thriftpy2/pull/324#issuecomment-3317786717 suggestion and making it more of a "use/catch at your own risk" based on the parsing. Possibly with default resolution being consistent with the current behavior, but allowing downstream callers to check other attrs.

All solutions are a little bit awkward because even the status quo technically is not compliant with Apache Thrift based on what you just posted.

FrankPortman avatar Sep 22 '25 17:09 FrankPortman

@aisk any thoughts re the above?

FrankPortman avatar Sep 24 '25 22:09 FrankPortman

The import error of the native Thrift is also confusing. I think that when the Thrift file is parsed by the parser in thriftpy2, if there is a module attribute overwrite, we can throw an exception with a clear error reason.

cocolato avatar Sep 26 '25 03:09 cocolato

https://github.com/Thriftpy/thriftpy2/blob/8e82f1c091a9d9db30dc01af25431d06056c3d88/thriftpy2/parser/parser.py#L619-L621 We can extend the ModuleType.

class ThriftModule(ModuleType):

    def __setattr__(self, name: str, value: Any) -> None:
        if getattr(self, name, None):
            raise AttributeError("...")
        return super().__setattr__(name, value)

cocolato avatar Sep 26 '25 03:09 cocolato

Sounds good to me. I can take a stab at adding the parse-time error—unless you already have an approach in mind?

On the “why support this” point: even if Apache Thrift’s generated Python fails at import time, thriftpy2 is an IDL loader/implementation rather than a codegen mirror. Since the parser can distinguish qualified names, we could make this behavior opt-in without violating the IDL (Scrooge handles it fine for Scala). That gives users three choices:

  • strict Apache – reject the schema at parse-time if a local type name collides with an included module
  • support collision – my PR

I’m happy to wire up the error mode first and, if there’s interest, follow with an opt-in qualified_wins resolver plus warnings.

FrankPortman avatar Sep 30 '25 13:09 FrankPortman

I'd like to raise an AttributeError for this case to match the behavior of Apache Thrift.

For better error information, we can add a message like "Did you mean ...?" just like the built-in errors introduced in Python 3.10: https://docs.python.org/3/whatsnew/3.10.html#attributeerrors, but this is not required, and if the implementation of the better error message is too complicated, I'd prefer not to add it, not only due to performance concerns, but also because of maintenance complexity.

aisk avatar Oct 02 '25 13:10 aisk

@aisk do you have bandwidth or are you already actively working on this? I probably can't contribute any new code for a few weeks at least.

FrankPortman avatar Oct 06 '25 20:10 FrankPortman

I haven't been working on this recently. When you have time, feel free to check if anyone has completed it before starting on it. No pressure at all!

aisk avatar Oct 07 '25 14:10 aisk