Use namedtuple for namespace qualifiers in XML attributes
We should expose the namespace (prefix, uri) qualifiers of XML attributes in the xml.Node.attributes parameter in a more structured manner, perhaps with the use of namedtuples? Right now the namespace prefix is encoded as part of the attribute name, with no option to declare the namespace uri. But if we change the signature to accept namedtuples:
class Node():
    def __init__(self, tag: str, nsdefs: list[(?str,str)]=[], prefix: ?str=None,
                 attributes: list[((name: str, prefix: ?str, uri: ?str), value: ?str)]=[],
                 children: list[Node]=[], text: ?str=None, tail: ?str=None):
Then, when encoding XML, the encoder would keep track of the active namespace definitions on the ancestor nodes and then:
- if this is a new namespace not defined in any of the ancestors, include the explicit namespace definition in the node that contains the attribute, like
<node xmlns:foo="foo" foo:attr="val" />
- if this namespace is defined in an ancestor node, reuse the existing explicit prefix, like
<ancestor xmlns:foo="foo"><node foo:attr="val" /></ancestor>
- unless the containing xml.Node has a redundant (repeated) explicit namespace definition in its nsdefs, then declare the explicit namespace prefix again, like
<ancestor xmlns:foo="foo"><node xmlns:foo="foo" foo:attr="val" /></ancestor>
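
To make the three rules above concrete, here is a rough sketch in Python (used purely for illustration, not the language of the xml module). `QName`, `Node`, `_nsdecl` and `encode` are hypothetical names that only mirror the proposed signature. It assumes attribute values are plain strings and ignores `text`, `tail` and the node's own `prefix`:

```python
from collections import namedtuple
from xml.sax.saxutils import quoteattr

# Hypothetical stand-ins for the proposal; not the real xml module API.
QName = namedtuple("QName", ["name", "prefix", "uri"])


class Node:
    def __init__(self, tag, nsdefs=(), attributes=(), children=()):
        self.tag = tag
        self.nsdefs = list(nsdefs)          # [(prefix or None, uri)]
        self.attributes = list(attributes)  # [(QName, value)]
        self.children = list(children)


def _nsdecl(prefix, uri):
    # A None prefix stands for the default namespace declaration.
    name = "xmlns" if prefix is None else f"xmlns:{prefix}"
    return f"{name}={quoteattr(uri)}"


def encode(node, active=None):
    """Serialize node; `active` maps prefix -> uri as declared by ancestors."""
    active = dict(active or {})  # copy so siblings do not see our bindings
    decls = []
    # Rule 3: explicit nsdefs are always emitted, even when redundant.
    for prefix, uri in node.nsdefs:
        decls.append(_nsdecl(prefix, uri))
        active[prefix] = uri
    attrs = []
    for qname, value in node.attributes:
        name = qname.name
        if qname.uri is not None:
            # Rule 2: reuse a prefix that is already bound to this uri.
            prefix = next((p for p, u in active.items()
                           if p is not None and u == qname.uri), None)
            if prefix is None:
                # Rule 1: new namespace, declare it on this node.
                if qname.prefix is None:
                    raise ValueError("namespaced attribute needs a prefix")
                prefix = qname.prefix
                decls.append(_nsdecl(prefix, qname.uri))
                active[prefix] = qname.uri
            name = f"{prefix}:{name}"
        elif qname.prefix is not None:
            # No uri given: keep the prefix as-is, like today's behaviour.
            name = f"{qname.prefix}:{name}"
        attrs.append(f"{name}={quoteattr(value)}")
    head = " ".join([node.tag] + decls + attrs)
    if not node.children:
        return f"<{head} />"
    inner = "".join(encode(child, active) for child in node.children)
    return f"<{head}>{inner}</{node.tag}>"


attr = (QName("attr", "foo", "foo"), "val")
print(encode(Node("node", attributes=[attr])))
print(encode(Node("ancestor", nsdefs=[("foo", "foo")],
                  children=[Node("node", attributes=[attr])])))
```

The sketch looks up existing bindings by uri rather than by prefix, which is one reading of "reuse the existing explicit prefix". It does not handle two different uris competing for the same prefix on one node.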
An alternative to named tuples is to use a class QName (other XML libs call it QName), but namedtuples seem simpler and thus more appealing.
Another variant for the structure itself could be:
attributes: list[(name: str, value: ?str, uri: ?str, prefix: ?str)]
since name is mandatory, it comes first, then value, which is very likely to be set, and only then uri and prefix, if used. This way we don't have to build a nested tuple that mimics a QName class and can keep it a little flatter. WDYT?
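
For comparison, a small Python sketch of the two shapes side by side (illustrative only; `QName`, `NestedAttr` and `FlatAttr` are made-up names, and the defaults are an assumption about how the optional fields would behave):

```python
from typing import NamedTuple, Optional


class QName(NamedTuple):
    # Nested variant: the qualified name is its own tuple.
    name: str
    prefix: Optional[str] = None
    uri: Optional[str] = None


class NestedAttr(NamedTuple):
    qname: QName
    value: Optional[str] = None


class FlatAttr(NamedTuple):
    # Flat variant: mandatory name first, then value, then the qualifiers.
    name: str
    value: Optional[str] = None
    uri: Optional[str] = None
    prefix: Optional[str] = None


nested = NestedAttr(QName("attr", prefix="foo", uri="foo"), "val")
flat = FlatAttr("attr", "val", uri="foo", prefix="foo")

# The common un-namespaced case stays short in the flat variant.
plain = FlatAttr("id", "42")
name, value, *_ = plain
```

With the flat layout, an un-namespaced attribute is just `(name, value)` plus defaults, while the nested layout needs an extra `QName(...)` wrapper even when no namespace is involved.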