`--include-package` doesn't allow for a subdirectory named `modules`

Open nikhilweee opened this issue 5 years ago • 1 comment

UPDATE: This issue only occurs when I run allennlp directly from a source checkout (python3 allennlp/run.py train). If I install allennlp using pip and then run allennlp train, the issue does not occur.

Describe the bug

I'm trying to use allennlp as a library. My package can't have a subdirectory named 'modules'.

To Reproduce

  1. Clone and set up https://github.com/allenai/allennlp-as-a-library-example
  2. Rename my_library/models to my_library/modules. Optional: also update the imports in my_library/__init__.py and in the renamed package's __init__.py (originally my_library/models/__init__.py).
  3. Try running the train command as described in the README. Since I haven't pip installed allennlp, I'm calling the run script directly.
python3 ../allennlp/allennlp/run.py train experiments/venue_classifier.json -s /tmp/your_output_dir_here --include-package my_library

Expected behavior

The command should run without any issues. Instead, I get the following traceback:

Traceback (most recent call last):
  File "../allennlp/allennlp/run.py", line 21, in <module> 
    run()
  File "../allennlp/allennlp/run.py", line 18, in run
    main(prog="allennlp")
  File "/u/nikhilweee/projects/allennlp/allennlp/commands/__init__.py", line 101, in main
    import_submodules(package_name)
  File "/u/nikhilweee/projects/allennlp/allennlp/common/util.py", line 328, in import_submodules
    for module_finder, name, _ in pkgutil.walk_packages(path):
  File "/opt/share/Python-3.6.2/x86_64/lib/python3.6/pkgutil.py", line 107, in walk_packages
    yield from walk_packages(path, info.name+'.', onerror) 
  File "/opt/share/Python-3.6.2/x86_64/lib/python3.6/pkgutil.py", line 92, in walk_packages
    __import__(info.name)
  File "/u/nikhilweee/projects/allennlp/allennlp/modules/seq2seq_encoders/__init__.py", line 92, in <module>
    Seq2SeqEncoder.register("gru")(_Seq2SeqWrapper(torch.nn.GRU))
  File "/u/nikhilweee/projects/allennlp/allennlp/common/registrable.py", line 48, in add_subclass_to_registry
    name, cls.__name__, registry[name].__name__)
AttributeError: '_Seq2SeqWrapper' object has no attribute '__name__'
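The last frame can be reproduced in isolation. The following is a minimal sketch, not AllenNLP's actual code: `registry`, `register`, `ConfigurationError`, and `Wrapper` are hypothetical stand-ins. It assumes, based on the traceback, that the duplicate-name error message reads `__name__` from the entry already stored in the registry, which fails when that entry is an instance rather than a class.

```python
from typing import Callable, Dict


class ConfigurationError(Exception):
    """Hypothetical error a registry would raise on a duplicate name."""


# Hypothetical registry mapping a name to whatever object was registered.
registry: Dict[str, object] = {}


def register(name: str) -> Callable[[object], object]:
    def add_to_registry(obj: object) -> object:
        if name in registry:
            # Building the message reads __name__ from the stored entry.
            # Classes have __name__; ordinary instances do not.
            message = "Cannot register %s; name already in use for %s" % (
                name, registry[name].__name__)
            raise ConfigurationError(message)
        registry[name] = obj
        return obj
    return add_to_registry


class Wrapper:
    """Hypothetical stand-in for a wrapper whose instances get registered."""

    def __init__(self, wrapped: type) -> None:
        self.wrapped = wrapped


register("gru")(Wrapper(dict))  # first registration succeeds

try:
    register("gru")(Wrapper(dict))  # second registration of the same name
except AttributeError as err:
    # The intended ConfigurationError is never raised, because formatting
    # its message fails first.
    outcome = f"AttributeError: {err}"

print(outcome)
```

Under these assumptions, the duplicate registration surfaces as an AttributeError instead of a readable "already registered" error, which matches the shape of the traceback above.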

System (please complete the following information):

  • OS: RHEL 7.4
  • Python version: 3.6.2
  • AllenNLP version: I installed from master https://github.com/allenai/allennlp/commit/6746d12c51e26ea25c326b40cda4bcf4109b808d
  • PyTorch version: 1.1.0

Additional context

The issue only occurs if my package has a subdirectory named 'modules'. On digging further, I found that pkgutil.walk_packages also recursively (re)imports any package with a matching name from allennlp (in this case, allennlp.modules). Some of those modules (like allennlp/modules/seq2seq_encoders/__init__.py) contain registration lines such as Seq2SeqEncoder.register("gru")(_Seq2SeqWrapper(torch.nn.GRU)). Running them a second time fails because "gru" is already registered. Judging by the traceback, the AttributeError itself is raised while that error message is being formatted, since the registered _Seq2SeqWrapper instance has no __name__.
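The name clash can be reproduced without AllenNLP. The sketch below is an illustration under assumptions, not the project's code: the temporary directories `script_dir` and `lib_root` are hypothetical stand-ins for the directory containing run.py and for the example repository root. It relies on two documented behaviours: Python puts the script's directory at the front of sys.path, and pkgutil.walk_packages imports each package it finds under the name it yields. Without a `prefix`, that name is the bare `modules`.

```python
import importlib
import pkgutil
import sys
import tempfile
from pathlib import Path


def make_package(root: Path, dotted_name: str) -> None:
    """Create nested directories with empty __init__.py files under root."""
    current = root
    for part in dotted_name.split("."):
        current = current / part
        current.mkdir(parents=True, exist_ok=True)
        (current / "__init__.py").touch()


tmp = Path(tempfile.mkdtemp())
script_dir = tmp / "script_dir"  # stand-in for the directory containing run.py
lib_root = tmp / "lib_root"      # stand-in for the example repository root

make_package(script_dir, "modules.seq2seq_encoders")
make_package(lib_root, "my_library.modules")

# Mimic running a script from script_dir with the library importable too.
sys.path[:0] = [str(script_dir), str(lib_root)]
importlib.invalidate_caches()

my_library = importlib.import_module("my_library")

# No prefix: the walker yields "modules" and imports it as a top-level
# package, which resolves to script_dir/modules, then recurses into it.
bare_names = [info.name for info in pkgutil.walk_packages(my_library.__path__)]
shadow_file = sys.modules["modules"].__file__

# With a prefix, yielded names are fully qualified, so only packages
# inside my_library are imported.
qualified_names = [
    info.name
    for info in pkgutil.walk_packages(my_library.__path__, prefix="my_library.")
]

print(bare_names)
print(shadow_file)
print(qualified_names)
```

In the unprefixed walk, `modules.seq2seq_encoders` shows up even though `my_library/modules` contains no such subpackage, which shows that the walker descended into the wrong `modules`. Passing a `prefix` is one possible way to avoid the clash. Whether it fits `import_submodules` as written is an open question for a PR.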

nikhilweee, Aug 01 '19 14:08

PR welcome

joelgrus, Aug 07 '19 19:08