pycapnp icon indicating copy to clipboard operation
pycapnp copied to clipboard

Absolute import causes duplicate ID

Open elagil opened this issue 2 years ago • 9 comments

Hello!

I want to use multiple schema files, where one schema is a parent schema that imports a child schema. I also want to import the child schema directly for use in my Python script. I made a minimum implementation that causes the duplicate ID error. The schemas are located within a schemas subfolder.

The parent schema, located in /schemas/parent.capnp

@0x95c41c96183b9c2f;

using import "/schemas/child.capnp".Child;

struct Parent {
    child @0 :List(Child);
}

The child schema, located in /schemas/child.capnp

@0x9afc0f7513269df3;

struct Child {
    name @0 :Text;
}

The test application, located in /test.py

import capnp
from schemas.parent_capnp import Parent
from schemas.child_capnp import Child

Running the script gives KjException child.capnp:0: failed: Duplicate ID @0x9afc0f7513269df3. on line 3.

What is causing this?

Using Python 3.8.10 on Windows, with pycapnp 1.1.0, same on Ubuntu 20.04 (same Python/pycapnp versions).

elagil avatar Nov 08 '21 16:11 elagil

I have no idea but maybe try to move @0x95c41c96183b9c2f; after the import in parent?

tolidano avatar Jan 26 '22 22:01 tolidano

FWIW it sounds like the system isn't recognizing that import "/schemas/child.capnp" and from schemas.child_capnp import Child are importing the same file. So child.capnp ends up being parsed twice, with the system believing it is parsing two different files, and then it decides that the files conflict because they have the same ID.

kentonv avatar Jan 27 '22 00:01 kentonv

Does this mean that the solution may be to try omitting this line: from schemas.child_capnp import Child

And seeing if the import creates the Child object properly anyway? Is it possible the schema parser ultimately creates the nested objects at the same scope?

tolidano avatar Jan 27 '22 00:01 tolidano

No, if you need to reference Child in your Python then I think you need to import it. But somehow the filenames fed to the C++ parser library are not matching... Unfortunately I'm not familiar enough with pycapnp to diagnose the root cause here, I'm just explaining what's happening on the underlying C++ side.

kentonv avatar Jan 27 '22 00:01 kentonv

Some futzing resulted in this working, although not likely what you want:

@0x95c41c96183b9c2f;

# using import "/schemas/child.capnp".Child;

struct Parent {
    child @0 :List(Child);
}

struct Child {
    name @0 :Text;
}

with

import capnp
from schemas.child_capnp import Child
from schemas.parent_capnp import Parent

parent = Parent.new_message()
child = parent.init("child", 1)[0]
child.name = "Shawn"
print(parent)

But this was the only incantation that produced what you would want here.

tolidano avatar Jan 27 '22 02:01 tolidano

Threw a few debugging statements into the capnp.pyx:

diff --git a/capnp/lib/capnp.pyx b/capnp/lib/capnp.pyx
index dd0f9b8..e89a9d7 100644
--- a/capnp/lib/capnp.pyx
+++ b/capnp/lib/capnp.pyx
@@ -3463,11 +3463,13 @@ cdef class SchemaParser:
 
         ret = _ParsedSchema()
         # TODO (HaaTa): Convert to parseFromDirectory() as per deprecation note
+        print(f"PDF LOADIN {diskPath}")
         ret._init_child(self.thisptr.parseDiskFile(displayName, diskPath, importArray.asArrayPtr()))
 
         return ret
 
     def load(self, file_name, display_name=None, imports=[]):
+        print(f"SP LOADING {file_name}")
         """Load a Cap'n Proto schema from a file
 
         You will have to load a schema before you can begin doing anything
@@ -3569,6 +3571,7 @@ cdef class SchemaParser:
         for imp in imports:
             if _os.path.isdir(imp):
                 filtered_imports.append(imp)
+        print(f"FS LOADING {file_name}")
         fileSchema = parser._parse_disk_file(display_name, file_name, filtered_imports)
         _load(fileSchema, module)
 
@@ -4306,6 +4309,7 @@ def cleanup_global_schema_parser():
 
 
 def load(file_name, display_name=None, imports=[]):
+    print(f"GB LOADING {file_name}")
     """Load a Cap'n Proto schema from a file
 
     You will have to load a schema before you can begin doing anything

and get the same thing over and over:

GB LOADING /Users/shawn.tolidano/dev/junk/capnp/schemas/child.capnp
SP LOADING /Users/shawn.tolidano/dev/junk/capnp/schemas/child.capnp
FS LOADING /Users/shawn.tolidano/dev/junk/capnp/schemas/child.capnp
PDF LOADIN /Users/shawn.tolidano/dev/junk/capnp/schemas/child.capnp
GB LOADING /Users/shawn.tolidano/dev/junk/capnp/schemas/parent.capnp
SP LOADING /Users/shawn.tolidano/dev/junk/capnp/schemas/parent.capnp
FS LOADING /Users/shawn.tolidano/dev/junk/capnp/schemas/parent.capnp
PDF LOADIN /Users/shawn.tolidano/dev/junk/capnp/schemas/parent.capnp

I then created schemas/grandparent.capnp and tried to import GrandParent it after Parent:

> python3 test.py              
GB LOADING /Users/shawn.tolidano/dev/junk/capnp/schemas/parent.capnp
SP LOADING /Users/shawn.tolidano/dev/junk/capnp/schemas/parent.capnp
FS LOADING /Users/shawn.tolidano/dev/junk/capnp/schemas/parent.capnp
PDF LOADIN /Users/shawn.tolidano/dev/junk/capnp/schemas/parent.capnp
GB LOADING /Users/shawn.tolidano/dev/junk/capnp/schemas/grandparent.capnp
SP LOADING /Users/shawn.tolidano/dev/junk/capnp/schemas/grandparent.capnp
FS LOADING /Users/shawn.tolidano/dev/junk/capnp/schemas/grandparent.capnp
PDF LOADIN /Users/shawn.tolidano/dev/junk/capnp/schemas/grandparent.capnp
Traceback (most recent call last):
  File "test.py", line 6, in <module>
    from schemas.grandparent_capnp import GrandParent
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 655, in _load_unlocked
  File "<frozen importlib._bootstrap>", line 618, in _load_backward_compatible
  File "capnp/lib/capnp.pyx", line 4367, in capnp.lib.capnp._Loader.load_module
  File "capnp/lib/capnp.pyx", line 4346, in capnp.lib.capnp.load
  File "capnp/lib/capnp.pyx", line 3575, in capnp.lib.capnp.SchemaParser.load
capnp.lib.capnp.KjException: /Users/shawn.tolidano/dev/junk/pycapnp/bundled/capnproto-c++/src/capnp/compiler/node-translator.c++:2165: context: member.name = child
schemas/parent.capnp:0: failed: Duplicate ID @0x95c41c96183b9c2f.

Then inverted the imports (so GrandParent first):

> python3 test.py              
GB LOADING /Users/shawn.tolidano/dev/junk/capnp/schemas/grandparent.capnp
SP LOADING /Users/shawn.tolidano/dev/junk/capnp/schemas/grandparent.capnp
FS LOADING /Users/shawn.tolidano/dev/junk/capnp/schemas/grandparent.capnp
PDF LOADIN /Users/shawn.tolidano/dev/junk/capnp/schemas/grandparent.capnp
GB LOADING /Users/shawn.tolidano/dev/junk/capnp/schemas/parent.capnp
SP LOADING /Users/shawn.tolidano/dev/junk/capnp/schemas/parent.capnp
FS LOADING /Users/shawn.tolidano/dev/junk/capnp/schemas/parent.capnp
PDF LOADIN /Users/shawn.tolidano/dev/junk/capnp/schemas/parent.capnp
Traceback (most recent call last):
  File "test.py", line 6, in <module>
    from schemas.parent_capnp import Parent
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 655, in _load_unlocked
  File "<frozen importlib._bootstrap>", line 618, in _load_backward_compatible
  File "capnp/lib/capnp.pyx", line 4367, in capnp.lib.capnp._Loader.load_module
  File "capnp/lib/capnp.pyx", line 4346, in capnp.lib.capnp.load
  File "capnp/lib/capnp.pyx", line 3575, in capnp.lib.capnp.SchemaParser.load
capnp.lib.capnp.KjException: parent.capnp:0: failed: Duplicate ID @0x95c41c96183b9c2f.

Interestingly, one listed a context, and one did not.

Is it possible that the SchemaParser.load method is being called and no check is being done to ensure that a particular schema isn't already loaded (perhaps by maintaining a list of @0x95 IDs seen)?

tolidano avatar Jan 27 '22 04:01 tolidano

When the C++ schema parser interprets the line import "/schemas/child.capnp", I don't think it calls back into Python. I think it directly looks for the file on disk, using the import path. So your print statements added on the Python side wouldn't see what file it ends up looking for. I suspect it ends up constructing a different filename in this case, which is why it isn't recognized as the same file.

kentonv avatar Jan 27 '22 15:01 kentonv

Changing this: using import "/schemas/child.capnp".Child; to this: using import "child.capnp".Child; eliminates the problem. However, if you wanted to have a folder structure for your schemas, the problem returns.

tolidano avatar Jan 27 '22 18:01 tolidano

Yes. Thus, my current solution is to simply use relative imports, along with some hacky __init__.py files in Python, which make the definitions available:

# __init__.py
import os

import capnp

capnp.remove_import_hook()

here = os.path.dirname(os.path.abspath(__file__))

target_definition_capnp = os.path.abspath(os.path.join(here, "target_definition.capnp"))

TargetDefinition = capnp.load(target_definition_capnp).TargetDefinition

elagil avatar Feb 28 '22 13:02 elagil