pyp
Submodule Import Problems
Suppose I have a string of HTML content and would like to extract certain information from it:
pyp "xml.etree.ElementTree.fromstring('<html><head><title>Title</title></head></html>').find('head/title').text"
Even though pyp tried to import xml, I still get AttributeError: module 'xml' has no attribute 'etree', because xml.etree.ElementTree is a submodule that importing xml alone does not load.
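The behavior can be reproduced outside pyp. The following is a minimal sketch, assuming a standard CPython install; the helper name xml_has_etree is made up for illustration. It runs each import statement in a fresh interpreter, so that modules already loaded in the current process cannot mask the result.

```python
import subprocess
import sys


def xml_has_etree(import_stmt):
    """Run import_stmt in a fresh interpreter and report whether
    the xml package object has an 'etree' attribute afterwards."""
    code = (
        f"{import_stmt}; import sys; "
        "print(hasattr(sys.modules['xml'], 'etree'))"
    )
    result = subprocess.run(
        # -I and -S keep the child interpreter free of user site hooks.
        [sys.executable, "-I", "-S", "-c", code],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip() == "True"


print(xml_has_etree("import xml"))
print(xml_has_etree("import xml.etree.ElementTree"))
```

On a standard install this should print False and then True: the submodule only becomes an attribute of the package once it has been imported itself.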
I can explicitly use the -b parameter to import it properly:
pyp -b "import xml.etree.ElementTree" "xml.etree.ElementTree.fromstring('<html><head><title>Title</title></head></html>').find('head/title').text"
# Title
However, if I add the same line to the PYP_CONFIG_PATH config file, the same AttributeError still happens.
cat $PYP_CONFIG_PATH
# import xml.etree.ElementTree
pyp "xml.etree.ElementTree.fromstring('<html><head><title>Title</title></head></html>').find('head/title').text"
# AttributeError: module 'xml' has no attribute 'etree'
So, the question is:
What is the correct way to have xml.etree.ElementTree imported automatically?
Thanks for the issue!
Yeah, it's hard to know statically what to import when you see an expression like "xml.etree.ElementTree". Not sure I see a way to get that to work out of the box without special casing.
Hm, adding that line to your config should work... we should treat the config element "import xml.etree.ElementTree" as defining "xml" and so statically include it in the code we execute. It looks like that's not happening and so it's falling back to the generic import missing things code. This is a bug, I can fix it.
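To illustrate what "defining xml" means here, the sketch below uses the standard ast module to list the top-level names that an import statement binds. This is not pyp's actual implementation; the function name bound_names is hypothetical and only shows the general idea of reading definitions statically.

```python
import ast


def bound_names(source):
    """Return the names that import statements in source bind."""
    names = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # "import a.b.c" binds "a"; "import a.b.c as x" binds "x".
                names.append(alias.asname or alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            for alias in node.names:
                # "from a.b import c" binds "c" (or its alias).
                names.append(alias.asname or alias.name)
    return names


print(bound_names("import xml.etree.ElementTree"))
print(bound_names("import xml.etree.ElementTree as ET"))
print(bound_names("from xml.etree import ElementTree"))
```

The three calls give ['xml'], ['ET'] and ['ElementTree'] respectively, so a config line of the first form defines the name xml that the expression uses.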
In the meantime, a workaround could be something like adding from xml.etree import ElementTree to your config and using ElementTree. Similarly, adding import xml.etree.ElementTree as ET to your config and using ET would also work.
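In plain Python, both workaround forms bind a name directly to the submodule, so no attribute lookup on the xml package is involved. A small sketch using the HTML string from the original report (the variable names are only for illustration):

```python
from xml.etree import ElementTree
import xml.etree.ElementTree as ET

html = '<html><head><title>Title</title></head></html>'

# Both names refer to the same, fully imported submodule.
title_a = ElementTree.fromstring(html).find('head/title').text
title_b = ET.fromstring(html).find('head/title').text

print(title_a, title_b)
```

With one of these lines in the config, the pyp expression would use ElementTree or ET in place of xml.etree.ElementTree.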
The commit that I just pushed, https://github.com/hauntsaninja/pyp/commit/a3f2ebcf3ad48de1399b90c7f1029a3045fea0d1, makes your config example work. I'll see if I can think of improvements, compatible with pyp's mostly static analysis, that would make your initial version work as well.
Wow, thank you for the quick fix.
I searched the internet for Python module resolution and found importlib.util.find_spec. Not sure if it can be used.
Say, if I want to use another call, xml.dom.minidom.parse(...), is it possible to search level by level?
from importlib.util import find_spec
def spec_of(target):
    spec = find_spec(target)
    return (spec, spec.submodule_search_locations)
# spec_of('xml.dom.minidom.parse')[1] # Exception, __path__ not found on xml.dom.minidom (parent?)
# ModuleNotFoundError: __path__ attribute not found on 'xml.dom.minidom' while trying to find 'xml.dom.minidom.parse'
spec_of('xml.dom.minidom')[1] is not None # False, no submodule? (pure guess)
spec_of('xml.dom')[1] is not None # True, has submodule? (ditto)
spec_of('xml')[1] is not None # True, has submodule? (ditto)
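A level-by-level search along these lines could be sketched as follows. The function name longest_module_prefix is hypothetical, and this is not something pyp does. Note also that find_spec imports parent packages as a side effect when given a dotted name, which may not sit well with a mostly static approach.

```python
from importlib.util import find_spec


def longest_module_prefix(dotted):
    """Return the longest prefix of a dotted name that resolves
    to a module, or None if even the first part does not."""
    parts = dotted.split(".")
    found = None
    for i in range(1, len(parts) + 1):
        candidate = ".".join(parts[:i])
        try:
            # Imports the parent package of candidate if there is one.
            spec = find_spec(candidate)
        except (ImportError, AttributeError, ValueError):
            # e.g. the parent is a plain module with no __path__.
            break
        if spec is None:
            break
        found = candidate
    return found


print(longest_module_prefix("xml.dom.minidom.parse"))
print(longest_module_prefix("xml.etree.ElementTree.fromstring"))
```

For the two calls above this yields xml.dom.minidom and xml.etree.ElementTree, which are the modules that would need importing before the attribute access can succeed.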
Yeah, let me think about how to better make some of this stuff work. Not straightforward given the current implementation and static constraints.
In the meantime, imports like import xml.etree.ElementTree as ET or from xml.etree import ElementTree will work without issue.